ABSTRACT

Clusters of galaxies produce negative features at wavelengths λ > 1.25 mm in CMB
maps, by means of the thermal SZ effect, while point radio sources produce positive
peaks. This fact implies that a distribution of unresolved SZ clusters could be detected
using the negative asymmetry introduced in the odd moments of the brightness map
(skewness and higher), or in the probability distribution function (PDF) of the
fluctuations, once the map has been filtered to remove the contribution of
primordial CMB fluctuations on large scales. This property provides a consistency
check on the recent detections by the CBI and BIMA experiments of an excess of power
at small angular scales, in order to confirm that it is produced by a distribution
of unresolved SZ clusters. However, it will require at least 1.5-2 times more observing
time than the detection of the corresponding power signal. This approach could also
be used with the data of the planned SZ experiments (e.g. ACT, AMI, AMIBA, APEX, 8 m
South Pole telescope).

Key words: cosmology: cosmic microwave background - cosmology: observations - 
galaxies: clusters: general. 


1 INTRODUCTION 

Fluctuations in the Cosmic Microwave Background (CMB)
radiation can provide information about hot gas in
galaxy clusters over a wide range of redshifts (Sunyaev
& Zel'dovich 1972, 1980 (hereafter SZ); Birkinshaw
1999; Carlstrom, Holder & Reese 2002). On arcminute
angular scales and smaller, the thermal SZ contribution
to the CMB anisotropy is expected to dominate
that of the primary anisotropies (Sunyaev & Zeldovich
1970; Springel, White & Hernquist 2001 (hereafter SWH);
Eold^2003). A new generation of experiments is measuring
the CMB sky at these angular scales. In particular, two
recent experiments, BIMA (Dawson et al. 2002) and CBI
(Mason et al. 2002), both observing at frequencies around
30 GHz, have detected an excess of power in the multipole
region ℓ > 2000, where the SZ power is expected to dominate
over the CMB signal. Nevertheless, at these observing
frequencies, radio point sources are also known to contribute
significantly to the power (Longair & Sunyaev 1969;
Franceschini et al. 1989; Toffolatti et al. 1998) if they
are not subtracted properly from the CMB maps. The
reported detections of power have argued that this
point-source contamination is not a problem, thus suggesting
that the signal could be due to the SZ effect (Dawson et al.
2002; Bond et al. 2002; Komatsu & Seljak 2002). These
arguments are based on analytical models or simulations
of what we would expect to measure. Thus, it would be
interesting to explore, in a model-independent way, the
nature of these contributions. The importance of this topic
has been stressed recently by Cooray & Melchiorri (2002),
who suggested using a cross-correlation of CMB maps with
maps of the large scale structure. This idea has been
applied for this purpose to other datasets with coarser
angular resolution (e.g. Banday et al. 1996 to the COBE data,
or Rubiño-Martín, Atrio-Barandela & Hernández-Monteagudo
2000 to the Tenerife data).

Here, we propose a general model-independent method
to determine if the measured power excess in a
single-frequency map is (mainly) due to point sources or SZ
clusters. To this end, we use the fact that for frequencies below
217 GHz (λ > 1.25 mm), the thermal SZ effect produces
negative features in the maps, while the point sources
produce positive peaks. We illustrate this fact with Figure 1,
where we show two simulated one-dimensional maps, one
of SZ clusters observed at ν = 30 GHz, and the other one
of point sources. Dotted lines show the original (without
sources of any kind) zero level of fluctuations, while dashed
lines show the observed (average) zero level once the mean
of the map has been subtracted. With the same level of
fluctuations (same rms at the observed scale), a power spectrum
analysis is not able to distinguish these two cases, so we need
to use a statistic carrying information about the sign of the
underlying signal (e.g. the skewness) to suggest the nature of
the objects producing this excess of power. The existence of
negative skewness at λ > 1.25 mm, and of positive skewness
at λ < 1.25 mm, is a clear prediction for SZ clusters.

Figure 1. One-dimensional map of a single realisation of SZ
clusters (upper panel) and point sources (lower panel), observed with
a gaussian beam of 8' full-width half-maximum, and no noise.
Dashed lines show the average (zero) level of fluctuations, while
dotted lines show the original zero level before subtracting the
mean from the map. With the same level of fluctuations, the
power spectrum analysis does not permit us to distinguish between
these two cases. Therefore, we need to use the skewness, or to
proceed with an analysis of the asymmetry of the P(D) curve, in
order to separate these two cases.
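The argument above can be checked with a small numerical sketch. The toy
below is not the simulation behind Figure 1: the source density, the Pareto
flux law, the pixel grid and the function names (simulate_map, skewness,
power_spectrum) are all illustrative assumptions. It builds two
one-dimensional maps that differ only in the sign of the sources, so that
their power spectra coincide while their skewness has opposite sign.

```python
import numpy as np

def simulate_map(sign, n_pix=20000, n_src=400, fwhm_pix=8.0, seed=0):
    """Toy 1D map of unresolved sources of a single sign (+1 or -1)."""
    rng = np.random.default_rng(seed)
    sky = np.zeros(n_pix)
    positions = rng.integers(0, n_pix, n_src)
    # Assumed heavy-tailed flux law in arbitrary units (not from the paper).
    fluxes = 1.0 + rng.pareto(1.5, n_src)
    np.add.at(sky, positions, sign * fluxes)
    # Gaussian beam sampled out to +/- 4 FWHM.
    sigma = fwhm_pix / np.sqrt(8.0 * np.log(2.0))
    half = int(4 * fwhm_pix)
    x = np.arange(-half, half + 1)
    beam = np.exp(-0.5 * (x / sigma) ** 2)
    observed = np.convolve(sky, beam, mode="same")
    # Subtract the mean, as done for the dashed lines of Figure 1.
    return observed - observed.mean()

def skewness(m):
    """Normalised third moment of a zero-mean map."""
    return np.mean(m ** 3) / np.mean(m ** 2) ** 1.5

def power_spectrum(m):
    """Squared modulus of the Fourier modes; blind to the sign of the map."""
    return np.abs(np.fft.rfft(m)) ** 2

point_sources = simulate_map(+1)   # positive peaks
sz_clusters = simulate_map(-1)     # same sky with negative sources
```

Comparing power_spectrum(point_sources) with power_spectrum(sz_clusters)
gives identical arrays, whereas skewness() returns values of equal size
and opposite sign for the two maps.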

We investigate here the discrimination between positive
and negative sources using the probability distribution
function (PDF) of the observed flux. From a given
map, the PDF can be obtained easily as a histogram
of the (normalised) number of pixels within a given
flux interval. This tool has been widely used in radio
astronomy when studying the statistical properties of a
background of point sources (Scheuer 1957; Cavaliere et al. lflT.ll;
Condon 1974), because in that case the shape of this function
is strongly related to the statistical properties of
the sources (i.e. their spatial distribution). In this context,
this function is known as the 'deflection probability
distribution', or the P(D) curve. This 'P(D) formalism'
has been successfully applied to study the diffuse X-ray
background (Scheuer 1974; Fabian 97,4; Cavaliere & Setti
1976; Condon & Dressel 1978), as well as to determine
the contribution of discrete point sources to CMB maps
(Franceschini et al. 1989; Toffolatti et al. 1998). For the
CMB, if we assume the standard inflationary scenario, then
the primordial fluctuations are gaussian, so the P(D) itself
is a gaussian, as is the case for the standard instrumental
noise. However, the main characteristic of the P(D) curve
for point sources (Franceschini et al. 1989) or for SZ clusters
(Cole & Kaiser 1988) is its non-gaussianity. Typical P(D)
curves for a distribution of point sources or SZ clusters will
exhibit long tails (see Figure 2). The point is that at λ > 1.25
mm, radio sources will produce a positive tail, while SZ clusters
will give a negative one. It is important to mention that at
λ < 1.25 mm, both AGNs and SZ clusters will produce positive
tails. It is then necessary to use other characteristics
of both populations (frequency spectra, etc.) to distinguish
them. As an illustration, Figure 3 shows P(D) for
SZ sources at four frequencies, ν = 107 and 150 GHz (where
clusters give a negative signal) and 270 and 520 GHz
(where the signal from clusters is positive, and exactly
opposite in sign to the previous cases).

Figure 2. Example of the strong non-gaussianity of the P(D)
function for SZ clusters. We present the P(D) function for an SZ
map in the Rayleigh-Jeans region of the spectrum, where clusters
are "negative" sources. For comparison, the best gaussian fit to
this P(D) curve is also shown (σ = 6.1). This curve will be
explained in detail later in the paper.


2 SZ CLUSTERS AS NEGATIVE SOURCES 


It is possible to consider clusters of galaxies as “extended
sources” with a peculiar spectrum given by iSunvaevlll98fil 
where τ_T is the optical depth for Thomson scattering, x =
hν/kT_cmb (= hν_r/kT_r) is the dimensionless frequency, with
ν_r = ν(1 + z) and T_r = T_cmb(1 + z), so that x does not
depend on redshift z, and

    g(x) = \frac{x^4 e^x}{(e^x - 1)^2} f(x) = \frac{x^4 e^x}{(e^x - 1)^2} \left[ x \coth(x/2) - 4 \right]    (2)

is the spectral shape factor. Note that this shape factor
includes the term from B_ν, the Planck function, and δI_ν/I_ν,
and f(x) is the formula for the CMB spectrum distortions
due to Comptonization from Zeldovich & Sunyaev (1969).
From here, two different but equivalent approaches can be
used to estimate the spectral luminosity of the cluster. We
can obtain the spectral luminosity by just integrating the
change of the CMB intensity due to scattering by individual
electrons of temperature T_e(r) over the cluster volume,

    L_ν(x, z) = \frac{2 (k T_{cmb})^3}{(hc)^2} σ_T g(x) (1 + z)^3 4π \int \frac{k T_e(r)}{m_e c^2} n_e(r) dV    (3)

This expression, for the case of isothermal intergalactic gas,
is proportional to the total number of electrons in the
cluster, because in that case ∫ n_e(r) dV = M_ICG/(μ_e m_p),
where M_ICG is the mass of the hot intergalactic gas, and
μ_e is the mean molecular weight per electron. The important
point here is that clusters rapidly increase their spectral
luminosities with redshift (∝ (1 + z)^3). Taking into
account the luminosity distance to the source d_L(z), we can
obtain the spectral flux S_ν.

Figure 3. P(D) function for SZ clusters at different frequencies.
The first panel shows the g(x) function (see equation 2) and four
frequencies (ν = 107, 150, 270 and 520 GHz) where this function
(and so the flux density) takes the same absolute value. The second
panel shows the P(D) function for these four cases, using a simple
truncated power-law to model the cluster source counts (see
section 4), with values n(S) = 28 (S/1 Jy)^{-2.5} sr^{-1} at 150
GHz, truncating at S_0 = 0.1 mJy, and with angular resolution
θ_b = 1'. The P(D) function is presented relative to its average
value, so the distribution for the cases ν = 107 GHz and 150 GHz
is the mirror image about zero of the other two cases.
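The spectral shape factor of equation (2) is easy to evaluate numerically.
The sketch below assumes the form g(x) = x^4 e^x (e^x - 1)^-2 [x coth(x/2) - 4]
and a CMB temperature of 2.725 K, a value that is not quoted in the text;
the function names are illustrative. It locates the frequency at which
g(x) changes sign and evaluates g at the four frequencies of Figure 3.

```python
import math

H_OVER_K = 4.799243e-11   # Planck constant over Boltzmann constant, in K s
T_CMB = 2.725             # assumed CMB temperature in K (not quoted in the text)

def x_of(nu_ghz):
    """Dimensionless frequency x = h nu / (k T_cmb) for nu in GHz."""
    return H_OVER_K * nu_ghz * 1e9 / T_CMB

def g(x):
    """Spectral shape factor of the thermal SZ effect (assumed form of eq. 2)."""
    ex = math.exp(x)
    return x ** 4 * ex / (ex - 1.0) ** 2 * (x / math.tanh(x / 2.0) - 4.0)

def null_frequency_ghz(lo=100.0, hi=400.0, tol=1e-6):
    """Bisection for the frequency where g changes from negative to positive."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(x_of(mid)) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# g at the four frequencies quoted in the caption of Figure 3.
values = {nu: g(x_of(nu)) for nu in (107, 150, 270, 520)}
```

Under these assumptions the sign change falls close to 217 GHz, and the
four values in `values` agree in absolute value to within a few per cent,
negative for the two lower frequencies and positive for the two higher ones.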

On the other hand, according to Korolev, Sunyaev &
Yakubtsev (1986), we can use the central value of the
Comptonization parameter for the cluster,

    y_c = \int \frac{k T_e(l)}{m_e c^2} σ_T n_e(l) dl    (4)

For a given y_c, the surface brightness of the cluster does not
depend on redshift. Then, the flux from the cluster is equal
to

    S_ν(x) = 2π θ_0^2 \frac{(k T_{cmb})^3}{(hc)^2} g(x) y_c Y(θ/θ_0)    (5)

where θ_0 corresponds to the angular dimension of the cluster
core radius, and the Y(θ/θ_0) function takes into account
the angular dependence of y over the cluster image. One
important conclusion from this point of view of the problem
is that clusters with given physical parameters should have
a minimum flux at some redshift, due to the well-known
redshift dependence of the angular dimension of a cluster
with a given core radius. In a Universe with Ω = 1 the
angular dimension is minimum at z = 1.25^1, and at higher z
both the angular dimension and the flux increase: clusters
with given physical parameters have their minimum flux where
their angular dimension is minimum. It is important to note
that

(i) for an experiment with an angular resolution θ_b larger
than the core radius of clusters (θ_b ≫ θ_0), they will be
unresolved objects, and therefore they will appear to us as
point sources;

(ii) according to the dependence of g(x) on x, these
point sources will have “positive” flux at λ < 1.25 mm, and
“negative” flux at λ > 1.25 mm;

(iii) for given physical parameters, clusters show a minimum
flux at the redshift z where the angular dimension
is minimal. On the other hand, the observed source
counts depend on the luminosity function. For instance, if
we use the Press-Schechter formalism (Press & Schechter
1974, hereafter PS), then we have a divergence of the
(comoving) number density of objects at low masses (i.e. low
fluxes). Nevertheless, we know that cooling and feedback
play an important role in the SZ predictions, so we do
not expect to find bright SZ clusters with masses below
a few times 10^{13} M_⊙. It is then necessary to introduce a
low-mass cutoff in the PS formalism in order to derive
realistic SZ predictions, as has been done by several authors
(e.g. de Luca, Désert & Puget 1995; Komatsu & Kitayama
1999; Molnar & Birkinshaw 2000). Thus, we expect to
observe a minimum flux and a minimum angular dimension for
SZ clusters.


Several authors have studied the contribution of the SZ
effect to the power spectrum of CMB fluctuations at small
scales, both theoretically (e.g. Cooray 2001) or using
simulations (e.g. SWH; Zhang, Pen & Wang 2002). In any case, the
main observational emphasis is put on the power spectrum
(C_ℓ), because it is easier to measure than, for example, the
bispectrum. When determining if the excess of power at small
scales detected by CBI and BIMA is due to SZ clusters, the
comparison has been done in terms of the power spectrum
(Dawson et al. 2002; Bond et al. 2002; Komatsu & Seljak
2002). However, when we work with the power spectrum,
we lose information about the sign of the fluctuations,
so we need to find the skewness, or to proceed with
an analysis of the asymmetry of the P(D) function.

^1 For a cosmological model with Ω_m = 0.3 and Ω_Λ = 0.7, this
happens at z = 1.60.


3 STATISTICAL DESCRIPTION OF RANDOMLY DISTRIBUTED
POSITIVE/NEGATIVE SOURCES

The formalism relating the (differential) source counts and
the PDF (or P(D) function) of the observed deflection D at
a given point, due to a population of poissonian-distributed
unresolved sources, was first discussed by Scheuer (1957),
and extended by Condon (1974). This analytical approach
will permit us to understand much more deeply the observed
shapes of the P(D) curves, and to relate them to the
underlying source counts.

Using the standard notation from radio astronomy, let
n(S) be the differential counts per solid angle, at a given
frequency ν, and let b(θ, φ) be the response of a radio telescope
to a point source (normalised to 1 at the peak). Let s =
S b(θ, φ) be the response of the instrument to a source of
flux density S located at a given position (θ, φ) relative to
the beam centre. Then, the mean number of source responses
of flux between s and s + ds in the beam, R(s), is given by

    R(s) = \int n\left( \frac{s}{b(θ, φ)} \right) \frac{dΩ}{b(θ, φ)}    (6)

The relationship between n(S) (or R(s)) and the P(D) function,
for the case of a pencil-beam antenna, is given in terms
of the characteristic functions of R(s) and P(D), which can
be written as r(w) and p(w), respectively. The equations are

    r(w) = \int R(s) \exp(2π i w s) ds    (7)

    p(w) = \exp[ r(w) - r(0) ]    (8)

and the P(D) function can be obtained as the inverse Fourier
transform

    P(D) = \int p(w) \exp(-2π i w D) dw = \int \exp( r(w) - r(0) - 2π i w D ) dw    (9)


This relation can also be employed in the case of tracking
interferometers (Fomalont et al. 1988, 1993), using the
CLEANed map. For the case of a phase-switch interferometer,
the above relations still hold, but replacing Fourier
transforms by Bessel transforms (Scheuer 1957).

It should be noted that these expressions are general,
and therefore valid for the case of negative sources. Indeed,
from the previous equations it follows that if a population
of positive sources n(S), with S > 0, is described by the
function P(D), then the population of sources n(|S|), with
S < 0, is described by P(−D). Thus, in this section, we will
restrict ourselves to studying distributions of positive sources
(n(S) = 0 for S < 0), given that we can obtain the
corresponding distribution for negative sources with the
transformation P(D) → P(−D).

All the equations described throughout this section
implicitly assume unresolved objects.
Rowan-Robinson & Fabian (1974) have studied the
modifications introduced by extended sources, showing that
the condition for unresolved objects applies until θ_b ≈ θ_s,
where θ_s is the typical size of the source, and θ_b the beam
size.


3.1 Analytical cases


Let us consider here several particularly simple but useful
cases for the n(S) function, which can be treated
analytically. These cases will be used later. For a more general
case of source counts, we can use a Monte Carlo method to
work out the P(D) distribution (e.g. Hewish 1961). First
of all, we will consider a power-law shape, n(S) = K S^{-β},
with S > 0. As a second case, we will also consider a power-law
truncated at a certain flux density S_0, i.e. n(S) = 0
for S < S_0, and n(S) = K S^{-β} for S > S_0. Hereafter, we
will assume that S is given in Jy, so we are implicitly writing
n(S) = K (S/1 Jy)^{-β}, and the units for K are Jy^{-1} sr^{-1}.
We will also assume a gaussian antenna pattern, described as
b(θ) = exp(−(θ/σ_b)^2/2), where σ_b is the width of the beam,
and θ_b = \sqrt{8 \ln 2} σ_b its full-width half maximum (FWHM).

In these cases, the R(s) function can be obtained
analytically. Following Condon (1974), we define the effective
solid angle (Ω_e) as

    Ω_e = \int [ b(θ, φ) ]^{β-1} dΩ    (10)

We then obtain R(s) = K Ω_e s^{-β} for a pure power-law, and

    R(s) = \begin{cases} K Ω_e s^{-β}, & s > S_0 \\ K Ω_e S_0^{1-β} s^{-1}, & s < S_0 \end{cases}    (11)

for a power-law truncated below S_0 and the gaussian beam.
From these equations, it is straightforward to obtain
numerically the P(D) function. These problems have been
studied by Scheuer (1957) and Condon (1974) for the power-law
case^2, and by Scheuer (1974) (analytically) and Hewish
(1961) (numerically) for the truncated case.

Finally, it is also interesting to consider the case of
power-law source counts with an upper cut-off in flux, S_c.
This is the expression to use in order to compute the P(D)
function from a map where the brightest sources have been
subtracted down to the flux S_c. Therefore, if n(S) = K S^{-β}
for S_0 < S < S_c, and 0 elsewhere, then

    R(s) = \begin{cases} K Ω_e s^{-1} \left( S_0^{1-β} - S_c^{1-β} \right), & s < S_0 \\ K Ω_e s^{-β} \left[ 1 - (s/S_c)^{β-1} \right], & S_0 < s < S_c \\ 0, & s > S_c \end{cases}    (12)

^2 For the power-law case, Condon (1974) gives an analytic
expression for p(w), which is valid for 2 < β < 3, so in these cases
the numerical calculation is even easier.
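The numerical route from R(s) to P(D) through equations (7)-(9) can be
sketched with discrete Fourier transforms. The code below is only a
sketch under stated assumptions: it uses the three-branch R(s) of
equation (12) for a gaussian beam, samples s on a uniform grid, drops
responses fainter than one grid step, and takes arbitrary illustrative
values for K Ω_e, β, S_0 and S_c. The function names are hypothetical.
A convenient sanity check is that the mean, variance and third central
moment of the resulting P(D) equal the first three moments of R(s).

```python
import numpy as np

def response_counts(s, k_omega_e, beta, s0, sc):
    """R(s) for power-law counts between s0 and sc seen through a gaussian
    beam (the three branches of equation 12); zero at s = 0 and above sc."""
    s = np.asarray(s, dtype=float)
    r = np.zeros_like(s)
    faint = (s > 0) & (s < s0)
    mid = (s >= s0) & (s < sc)
    r[faint] = k_omega_e * (s0 ** (1 - beta) - sc ** (1 - beta)) / s[faint]
    r[mid] = k_omega_e * s[mid] ** (-beta) * (1 - (s[mid] / sc) ** (beta - 1))
    return r

def pd_from_response_counts(r_of_s, ds):
    """Discrete version of equations (7)-(9) on the grid s_j = j * ds."""
    n = r_of_s.size
    # r(w_k) = sum_j R_j exp(+2 pi i w_k s_j) ds, with w_k = k / (n ds)
    r_w = n * ds * np.fft.ifft(r_of_s)
    p_w = np.exp(r_w - r_w[0])
    # P(D_m) = sum_k p_k exp(-2 pi i w_k D_m) dw, with dw = 1 / (n ds)
    pd = np.fft.fft(p_w).real / (n * ds)
    d = ds * np.arange(n)
    return d, pd

ds = 0.02                      # grid step, arbitrary flux units
s = ds * np.arange(8192)       # grid wide enough to avoid wrap-around
R = response_counts(s, k_omega_e=0.5, beta=2.5, s0=1.0, sc=5.0)
D, P = pd_from_response_counts(R, ds)
# For negative sources the same curve applies with D replaced by -D.
```

The grid has to extend well beyond the brightest plausible deflection,
because the discrete transform is periodic and would otherwise fold the
positive tail back onto small values of D.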








































Discriminating between unresolved point sources and “negative” SZ clusters 5 


3.2 Confusion noise 


Apart from the P(D) function, it is also interesting to
characterise the properties of the source population by the
moments of the R(s) distribution. For a pure power-law
expression for the number-flux-density relation, it is clear that the
nth moment of the R distribution is

    <s^n> = \int_0^{S_c} s^n R(s) ds = \frac{K Ω_e}{n + 1 - β} S_c^{n+1-β},    n = 1, 2, 3, ...    (13)

where S_c represents the cutoff value for point-source
subtraction. For the truncated power-law case, the nth moment of
this distribution is

    <s^n> = \frac{K Ω_e}{n + 1 - β} S_c^{n+1-β} - \frac{(β - 1) K Ω_e}{n (n + 1 - β)} S_0^{n+1-β}    (14)

The second moment of the R(s) function (σ_c^2 = <s^2>) has
been extensively used to characterise the 'confusion noise',
i.e. the noise due to the presence of faint unresolved sources
inside the beam (Scheuer 1957). Normally, the adopted
criterion for the detection of sources is that the intensity is
q times the sigma of the map, so the minimum subtraction
threshold can be written as S_c = q σ_c, with q = 3-5 the
usual values. Thus, inserting this condition in equation (13),
we obtain for the power-law case:

    σ_c(q) = \left[ \frac{q^{3-β}}{3 - β} K Ω_e \right]^{1/(β-1)}    (15)

where we explicitly write that σ_c depends on q for this choice
of the subtraction threshold. Note that this threshold can be
decreased if we use measurements at higher angular
resolution. We will discuss this in a later section.

For the power-law source counts, it is easy to show that
the flux at which we have one source every X beam areas,
S_X, can be written as

    S_X (Jy) = ( X K Ω_e )^{1/(β-1)}    (16)

where the beam area is taken as (β − 1) Ω_e. From these last
two equations, we immediately see that for
the flux level S_c = q σ_c(q), we have one source every
q^2/(3 − β) beam areas. For the particular case of β = 2.1 and
q = 5, this expression takes the value ≈ 28, so we recover the
well-known result that if we have more than one source of a given
flux every 30-40 beam areas (the exact number depends on β),
we will be limited by confusion noise. We should recall here
that the P(D) function provides information (roughly) down to
the flux S_1 at which we have one source per beam area
(Scheuer 1973).
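The arithmetic of this subsection can be verified with a few lines of
code. The sketch assumes the power-law forms σ_c(q) = [q^(3-β) K Ω_e /
(3-β)]^(1/(β-1)) and S_X = (X K Ω_e)^(1/(β-1)), with the beam area taken
as (β-1) Ω_e, and the gaussian-beam value Ω_e = π θ_b^2 / [4 ln2 (β-1)].
The normalisation K and the 8 arcmin beam are illustrative choices, and
the function names are hypothetical.

```python
import math

def effective_solid_angle(fwhm_rad, beta):
    """Omega_e = integral of b^(beta-1) over solid angle, gaussian beam."""
    return math.pi * fwhm_rad ** 2 / (4.0 * math.log(2.0) * (beta - 1.0))

def confusion_sigma(q, k, omega_e, beta):
    """sigma_c(q) when sources brighter than q*sigma_c are subtracted."""
    return (q ** (3.0 - beta) / (3.0 - beta) * k * omega_e) ** (1.0 / (beta - 1.0))

def flux_one_source_per_x_beams(x_beams, k, omega_e, beta):
    """Flux S_X above which there is one source every x_beams beam areas."""
    return (x_beams * k * omega_e) ** (1.0 / (beta - 1.0))

def beams_per_source_at_threshold(q, beta):
    """Number of beam areas per source at the flux level S_c = q*sigma_c."""
    return q ** 2 / (3.0 - beta)

beta, q = 2.1, 5.0
k = 54.0                                            # illustrative, Jy^-1 sr^-1
omega_e = effective_solid_angle(math.radians(8.0 / 60.0), beta)  # 8' FWHM
sigma_c = confusion_sigma(q, k, omega_e, beta)
x_beams = beams_per_source_at_threshold(q, beta)    # 25 / 0.9
```

With β = 2.1 and q = 5 the last line evaluates to about 27.8 beam areas
per source, and flux_one_source_per_x_beams(x_beams, ...) returns exactly
q times sigma_c, which is the consistency between the two expressions
used in the text.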


3.3 Scaling of P(D) with frequency 

To conclude this section, we will derive the scaling of P(D)
with frequency for SZ clusters. This scaling can be derived
from the one for the source counts, which is given by

    n(S; ν) = \left| \frac{g(x_0)}{g(x)} \right| n\left( \frac{g(x_0)}{g(x)} S; ν_0 \right)    (17)

where x_0 = hν_0/kT_cmb. This equation is similar to the one
obtained by Condon (1984) for the scaling of the differential
source counts of point sources with power-law spectra (S ∝
ν^{-α}). From here, we obtain

    P(D; ν) = \left| \frac{g(x_0)}{g(x)} \right| P\left( \frac{g(x_0)}{g(x)} D; ν_0 \right)    (18)

We see that clusters of galaxies should be described by a
single P(D) function, which is the same at all frequencies (but
rescaled) if θ_b is the same, and which is equivalent to the
PDF for the y parameter. However, our description permits
the use of the main characteristic of the effect, the existence
of negative sources. Therefore, if we compare data from two
frequencies, one above ν = 217 GHz and the other one below,
then in the first case the P(D) for clusters will exhibit
a positive tail while in the second case it will exhibit a
negative tail, both cases being described by the same
(rescaled) P(D) function. If we now use this expression to
derive the moments of the P(D) function, we obtain

    <D^n; ν> = \left( \frac{g(x)}{g(x_0)} \right)^n <D^n; ν_0>,    n = 1, 2, 3, ...    (19)

Therefore, we explicitly see that the normalised moments of
a map of thermal SZ clusters (with no noise), <D^n>/σ^n,
are exactly the same in magnitude at all frequencies, but
all the (normalised) odd moments change sign
when we cross the frequency ν = 217 GHz.


4 SOURCE COUNTS FOR RADIO SOURCES 
AND CLUSTERS 

As we have seen in the last section, the shape of the P{D) 
function for (positive/negative) sources provides a means 
of determining the underlying source counts. Therefore, we 
will discuss here which are the typical source counts both 
for radio sources and SZ clusters. 


4.1 Radio point sources 

Several authors (e.g. Fischer & Lange 1993) have studied
the contribution to the confusion limits from different
populations of extragalactic sources. In the context of CMB
measurements at frequencies close to ν ~ 30 GHz, radio sources
are known to produce the main contribution to the confusion
noise. For this reason, recent experiments have
used a source subtraction technique, and thus have produced
source counts for these radio sources. These curves are well
fitted by power-laws in the flux density region around a few
mJy. Typical values for the K and β parameters, at frequencies
around 30 GHz, are n(S) ≈ 54 (S_{34 GHz}/1 Jy)^{-2.15} sr^{-1}
Jy^{-1}, for S_{34 GHz} > 60 mJy (from the VSA experiment,
Taylor et al. 2002), and n(S) ≈ (92 ± 23) (S_{31 GHz}/1 Jy)^{-2.0}
sr^{-1} Jy^{-1}, for S_{31 GHz} > 5 mJy (from the CBI group,
Mason et al. 2002). Apart from these source counts, we can
also extrapolate the μJy source counts at 8.4 GHz from
VLA observations (Fomalont et al. 2002) up to 30 GHz.
Using the spectral index α = 0.5 (S ∝ ν^{-α}), we obtain

^3 Indeed, this relation holds for all cases where the frequency
dependence of the observed flux of the object can be factorised,
i.e. S_ν = g(ν) Φ, where Φ does not depend on ν.
Figure 4. Differential source counts for SZ clusters at 30 GHz,
normalised to the Euclidean slope. We show the results from
a Press-Schechter prescription with Mmin = 5 x Mq 

and (78 = 0.9 (see section o for details), as well as the source 
counts from the paper of SWH (squares) using hydrodynamic
simulations. Both source counts for clusters compare well in shape
around 1 mJy, although the hydrodynamic simulations show
about 20% fewer objects at these fluxes, and do not show a strong
cutoff at low fluxes. For comparison, we also present here the
differential source counts for radio sources at 30 GHz from several
experiments: VSA (Taylor et al. 2002), at 34 GHz; CBI (Mason et
al. 2002) at 31 GHz; and VLA (Fomalont et al. 2002) at 8.4 GHz.
The source counts for this last experiment have been extrapolated
up to 30 GHz using their mean spectral index α = 0.75. From a
simple inspection of these source counts, we expect that
experiments with high angular resolution are going to be dominated
by radio sources, if they do not adopt a source subtraction
strategy.


niS) « (8.4 ±0.8) (530 g«./ 1 sr’i Jy-^. All 

these source counts are summarised in Figure 4.

We should mention that the contribution of radio
sources to the power (and thus to the skewness) decreases
with frequency as a power law governed by α, the spectral
index of radio sources. Therefore, the contribution of radio
sources to the observed map becomes less important at
higher frequencies (above 30 GHz), while the contribution
of clusters does not depend on frequency in the Rayleigh-Jeans
(RJ) region of the spectrum. It is important to recall
that other populations of sources (e.g. thermal dust emission
from galaxies) contribute to the source counts at higher
frequencies. However, we expect that future experiments like
ALMA will characterise those populations very precisely.

Given these source counts, the shapes of the corresponding
distribution functions are characterised by long positive
tails, as we have seen in the previous section. Inclusion of
the clustering of sources (Barcons 1992) broadens the
shape of the P(D), but the important point here is that the
long positive tail is still maintained.


4.2 SZ clusters 

Korolev, Sunyaev & Yakubtsev (1986) discussed count
curves for thermal SZ clusters, and showed that
they differ strongly from the case of radio sources.
de Luca, Désert & Puget (1995) have derived the source
counts for the thermal SZ effect using the Press-Schechter
mass function and assuming unresolved single-type clusters.
For the scaling of the temperature with the mass of
the cluster, they use T_e ∝ M^{2/3} (1 + z), so S_ν ∝ M^{5/3}.
These numbers are in agreement with those obtained from
recent X-ray observations (Mohr, Mathiesen & Evrard
1999; Ettori, De Grandi & Molendi 2002), and have been
shown to fit simultaneously optical and X-ray cluster
data (Diego et al. 2001). de Luca, Désert & Puget (1995)
show that typical curves can be well fitted by Euclidean
power-laws n(S) = K |S|^{-2.5} down to a few mJy, with
typical values of K ~ 0.44 sr^{-1} Jy^{-1} (extrapolated down
to 30 GHz, λ = 1 cm), and introducing a low-flux cutoff of
S_0 ~ 0.1 mJy. SWH, using hydrodynamical simulations,
came to similar conclusions, although their results do not
show an exact power-law behaviour, and the low-flux cutoff
is one order of magnitude smaller. For illustration, we show
in Figure 4 these source counts, and our derived source
counts for PS clusters, together with the radio source counts
described in the last subsection. The qualitative behaviour
is the same as pointed out above. These curves will be described
in detail in Section 7. Here, we will point out two general
aspects of any modelling of SZ clusters.

First, we should stress that the value for S_0 in each
model (semi-analytic or numeric) depends on the chosen
mass cutoff, M_min, i.e. the minimum mass of an object
contributing to the SZ effect (see discussion at the end of section
2). This minimum mass (M_min) is related to the minimum
flux (S_min) observed in a given cosmology. Using equation
(3) and assuming that the gas in the cluster is isothermal,
we obtain the expression for the total SZ decrement S_tot for
a galaxy cluster, as a function of its mass,

    S_{tot} = \frac{2 (k T_{cmb})^3}{(hc)^2} \frac{g(x) σ_T}{d_A^2(z)} \frac{k T_e}{m_e c^2} \frac{M f_g}{μ_e m_p}    (20)

where d_A(z) stands for the angular diameter distance,
and f_g is the gas mass fraction. Using the scaling T_e =
T_{e0} (M/M_0)^{2/3} (1 + z) pointed out above, we can derive for the
Rayleigh-Jeans region of the spectrum that

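    S^{RJ}_{tot} = -1.9 × 10^5 f_g h^{-1} \left( \frac{ν}{30 GHz} \right)^2 \left( \frac{T_{e0}}{9 × 10^7 K} \right) \left( \frac{M}{M_0} \right)^{5/3} \left( \frac{M_0}{10^{15} h^{-1} M_⊙} \right) (1 + z) \left( \frac{1 Mpc}{d_A(z)} \right)^2 Jy    (21)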


From here, it is straightforward to infer the minimum flux
for a given M_min. As an example, if we use the standard
values h = 0.7, Ω_m = 0.3, Ω_Λ = 0.7, and we assume a
constant value f_g = 0.1, then we find that for a given mass
we have the minimum flux at z ≈ 0.98, and its value is

    |S_min| = 20 \left( \frac{M_min}{M_0} \right)^{5/3} mJy

at 30 GHz. Typical values for S_min can be seen in Table 1.
Korolev, Sunyaev & Yakubtsev (1986) considered
T_e independent of redshift, so S_min in their case
was reached simultaneously with the minimum of the angular
diameter. Here we use a dependence of T_e on redshift
(∝ (1 + z)), and hence the minimum is reached when the
function (1 + z)/d_A^2(z) takes its minimum.
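The redshifts quoted in this section can be recovered with a short
numerical sketch. It assumes flat Friedmann models only, works in units
of c/H0, and uses a simple trapezoidal integration on a fine redshift
grid; the function name is illustrative. The minimum angular dimension
of an object of fixed physical size corresponds to the maximum of
d_A(z), and the minimum flux for T_e ∝ (1 + z) to the minimum of
(1 + z)/d_A^2(z).

```python
import numpy as np

def angular_diameter_distance(z, omega_m, omega_l):
    """d_A(z) in units of c/H0 for a flat model; z must be an increasing
    grid starting at 0 (cumulative trapezoidal integration of 1/E(z))."""
    inv_e = 1.0 / np.sqrt(omega_m * (1.0 + z) ** 3 + omega_l)
    steps = 0.5 * (inv_e[1:] + inv_e[:-1]) * np.diff(z)
    comoving = np.concatenate(([0.0], np.cumsum(steps)))
    return comoving / (1.0 + z)

z = np.linspace(0.0, 5.0, 50001)
d_eds = angular_diameter_distance(z, 1.0, 0.0)    # Omega = 1 model
d_lcdm = angular_diameter_distance(z, 0.3, 0.7)   # Omega_m = 0.3, Omega_L = 0.7

# Minimum angular dimension: maximum of the angular diameter distance.
z_size_min_eds = z[np.argmax(d_eds)]
z_size_min_lcdm = z[np.argmax(d_lcdm)]

# Minimum flux when T_e scales as (1 + z): minimum of (1 + z) / d_A^2.
flux_factor = (1.0 + z[1:]) / d_lcdm[1:] ** 2     # skip z = 0, where d_A = 0
z_flux_min = z[1:][np.argmin(flux_factor)]
```

On this grid the three redshifts come out close to the values quoted in
the text (1.25 for Ω = 1, about 1.6 and about 0.98 for the Ω_m = 0.3,
Ω_Λ = 0.7 model).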
Table 1. Dependence of the observed flux cutoff for clusters on
the minimum mass of a cluster, for ν = 30 GHz (see details in
text).

M_min (M_⊙)      |S_min| (mJy)
10^14            0.238
5 × 10^13        0.075
10^13            0.005
3 × 10^12        0.0007

Finally, we will mention that the inclusion of large-scale
clustering in the simulations has been studied by several
authors (e.g. Cole & Kaiser 1988; Zhang, Pen & Wang 2002).
As we pointed out in the last subsection, clustering broadens
the shape of the P(D) function with respect to the poissonian
case. In any case, these studies confirmed that the PDF for
the y parameter is always characterised by a long positive tail,
which corresponds to a negative tail when λ > 1.25 mm,
independently of whether we include clustering or not.


5 CONTRIBUTION OF THERMAL SZ SOURCES TO THE BISPECTRUM OF CMB
ANGULAR FLUCTUATIONS

A detailed study of the non-gaussian aspects of the thermal
SZ effect can be found in Cooray (2001). Here we will show
that the result about the change in sign from the previous
section holds for the bispectrum, or indeed any odd moment
of the distribution.

We first decompose the temperature anisotropy in
the spherical harmonics basis, so we have δT/T_cmb(n) =
Σ_{ℓm} a_{ℓm} Y_{ℓm}(n). From here, the bispectrum is usually
defined as B_3(ℓ_1 m_1, ℓ_2 m_2, ℓ_3 m_3) = <a_{ℓ_1 m_1} a_{ℓ_2 m_2} a_{ℓ_3 m_3}>
(Luo 1994). For the case of a thermal SZ sky, the temperature
anisotropy will be given by δT/T_cmb = f(x) y, where
f = x coth(x/2) − 4. In this equation, all the frequency
dependence is factorised in the f(x) function. Given that the
decomposition in the Y_{ℓm} basis is unique, we can conclude that
the a_{ℓm} coefficients will satisfy the relation a_{ℓm} = f(x) y_{ℓm},
where the y_{ℓm} quantities correspond to the coefficients of the
decomposition of the y function. In this way, we can write

    B_3(ℓ_1 m_1, ℓ_2 m_2, ℓ_3 m_3; ν) = \left( \frac{f(x)}{f(x_0)} \right)^3 B_3(ℓ_1 m_1, ℓ_2 m_2, ℓ_3 m_3; ν_0)    (22)

where we explicitly see the change in sign when we pass
through ν = 217 GHz. A similar relation also holds for all
the higher odd moments of the a_{ℓm} quantities.

Therefore, we expect an overall change of the sign of the
contribution of SZ clusters to the bispectrum when comparing
two maps, one observed at λ < 1.25 mm and the other
one at λ > 1.25 mm. However, for the case of SZ clusters, we
would expect a larger value of the non-gaussian features in
real space. The reason is that clusters are localised objects
in real space, but when averaging modes in Fourier space,
the resulting non-gaussianity is diluted. As an illustration
of this fact, Zhang, Pen & Wang (2002) show that the kurtosis
is (roughly) twice as large in real space as in Fourier space
when comparing angular scales around ℓ ~ 1000, while for
ℓ > 6000, the kurtosis in Fourier space goes rapidly to zero.
Hence, for the detection of negative skewness it is better
to work directly with real-space statistics, with the
advantage that they are easier to infer from data. This approach
has be en used by several authors, and in particular iGooravl 
i200 gives the relationship between the bispectrum and 
the skewness of the map smoothed on some scale with a 
given window function. The important point here is that 
this skewness, Hltered at some scale, will exhibit the same 
sign-dependence with frequency. 

Summarising these sections, the main characteristics of
SZ clusters are the long negative tails of their brightness
distributions at frequencies below 217 GHz, the absence of
sources in the vicinity of $\lambda = 1.25$ mm, and the existence of a
positive contribution at $\lambda < 1.25$ mm. In addition, skewness
(or any odd moment of the map) will retain the net sign of
the effect.


6 ESTIMATORS WHICH DISCRIMINATE THE SIGN OF THE SOURCES

In this section, we will be interested in the problem of
determining whether an excess of power in a map is due to positive
or negative sources. Therefore, we will not be interested in
identifying individual features in the maps, but in an average
contribution. As has been pointed out, we will use here
the P(D) function. This function, for a given map, can be
estimated by selecting a reasonable flux interval $\Delta D$, and
computing a histogram (number of pixels with a flux between
$D - \Delta D/2$ and $D + \Delta D/2$).

Let $P_s(D)$, $P_{SZ}(D)$, $P_{\rm cmb}(D)$, and $P_n(D)$ be the
distribution functions for the point sources, the SZ clusters,
the CMB and the instrumental (plus atmospheric) noise,
respectively. The observed P(D) function will then be given
by their convolution,

$P(D) = P_s(D) * P_{SZ}(D) * P_{\rm cmb}(D) * P_n(D)$ (23)
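
As a toy illustration of equation (23), the sketch below convolves a skewed "source" distribution with gaussian noise on a regular grid. The exponential negative tail, the grid limits and all function names are assumptions made for the example, not quantities from the paper. It shows that the third central moment of the convolved distribution equals that of the sources alone, because a symmetric noise term adds variance but no skewness.

```python
import math


def make_grid(start, n, step):
    """Regular grid of n deflection values starting at `start`."""
    return [start + i * step for i in range(n)]


def convolve_pdfs(p, q, step):
    """Discrete convolution of two PDFs sampled with the same bin width."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj * step
    return out


def central_moment(grid, pdf, order):
    """Central moment of a sampled PDF (normalised internally)."""
    norm = sum(pdf)
    mean = sum(d * p for d, p in zip(grid, pdf)) / norm
    return sum((d - mean) ** order * p for d, p in zip(grid, pdf)) / norm


def toy_example(step=0.05, sigma_noise=1.0, tail_scale=2.0):
    """Convolve a hypothetical negative exponential tail with gaussian noise."""
    # Toy "SZ-like" sources: P(D) ~ exp(D / tail_scale) for D <= 0.
    grid_src = make_grid(-30.0, int(round(30.0 / step)) + 1, step)
    p_src = [math.exp(d / tail_scale) / tail_scale for d in grid_src]
    # Gaussian noise sampled on a symmetric interval.
    grid_noise = make_grid(-6.0, int(round(12.0 / step)) + 1, step)
    p_noise = [math.exp(-0.5 * (d / sigma_noise) ** 2)
               / (sigma_noise * math.sqrt(2.0 * math.pi)) for d in grid_noise]
    # Observed distribution and the grid it lives on.
    p_obs = convolve_pdfs(p_src, p_noise, step)
    grid_obs = make_grid(grid_src[0] + grid_noise[0], len(p_obs), step)
    return {
        "m3_src": central_moment(grid_src, p_src, 3),
        "m3_obs": central_moment(grid_obs, p_obs, 3),
        "var_src": central_moment(grid_src, p_src, 2),
        "var_noise": central_moment(grid_noise, p_noise, 2),
        "var_obs": central_moment(grid_obs, p_obs, 2),
    }


if __name__ == "__main__":
    print(toy_example())
```

The variance of the result is the sum of the two input variances, so noise lowers the normalised skewness even though the third moment itself is unchanged.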

The (primordial) CMB distribution function is assumed to
be a Gaussian, although for the considered angular scales
($\ell > 2000$), it is expected to produce a negligible contribution
compared with the SZ or with the noise. We will discuss
this point in detail in a later section. The noise is also assumed
to be gaussian distributed. This is a reasonable assumption
for single dish radio-telescopes and drift-scan interferometers,
but it can also be used for CLEANed images of tracking
interferometers. Thus, the expected non-gaussianity in the
P(D) is introduced by sources (positive or negative), whose
distributions are characterised by skewed shapes. Therefore,
if we want to detect this asymmetry, and in particular its
sign, we could use one of the following estimators:

• Asymmetry (A) of the observed P(D) distribution.
This quantity can be estimated directly as the difference in
area between the positive and negative regions:

$A = \int_{D_p}^{+\infty} P(D)\,dD - \int_{-\infty}^{D_p} P(D)\,dD$ (24)

where $D_p$ stands for the value at which the P(D) function
peaks. The previous equation assumes that P(D) is normalised
to unit area, i.e. $\int P(D)\,dD = 1$, so A directly gives the
fractional difference in area. It should be noted that D is
usually quoted with respect to the deflection around the
mean level ($\bar{D}$), so once we have the P(D) function, we make
$D \rightarrow D - \bar{D}$, with $\bar{D} = \int D\,P(D)\,dD$.

• Non-gaussianity of the wings. If we obtain the P(D)
function, we could test the presence of point sources/SZ clusters
even in the case when they produce a mutually cancelling
asymmetry. This can be done by comparing the positive/negative
tail of the distribution with the one expected
from gaussian noise. This excess could be quantified as:

$A_+ = \int_{D_p}^{+\infty} [P(D) - G(D)]\,dD$ (25)

for the positive tail, where G(D) is the expected distribution
if we only have noise (normally assumed to be gaussian), and
a similar equation for the negative one.

• Skewness of the observed map. This cumulant has
information about the overall sign of the features producing
the deviation from gaussianity. This quantity can be estimated
using the third centred moment of the data:

$E[M_3] = \frac{1}{N_{\rm pix}} \sum_{i=1}^{N_{\rm pix}} (x_i - E[x])^3$ (26)

where $E[...]$ means that this is an estimator of the quantity
inside brackets, $N_{\rm pix}$ is the number of pixels of the map, $x_i$ is
the measured flux density at pixel $i$, and
$E[x] = \frac{1}{N_{\rm pix}} \sum_i x_i$
is the standard estimator for the mean of the distribution.
From here, the skewness is obtained as ${\rm Skew} = M_3/\sigma^3$,
where $\sigma$ is the rms of the data. Equation (26) is a biased
estimator of the third moment of the population, but for
large $N_{\rm pix}$ it converges to the true $M_3$ value. Assuming an
underlying gaussian PDF, the variance of this estimator (to
lowest order in $1/N_{\rm pix}$) is
${\rm Var}(E[M_3]) = \frac{6}{N_{\rm pix}} {\rm Var}(x)^3$ (see
Kesden, Cooray & Kamionkowski 2003).

If the P(D) is known, any moment of the distribution can
be derived from it, and in particular, the skewness can be
written as

${\rm Skew} = \frac{\int D^3 P(D)\,dD}{\left[ \int D^2 P(D)\,dD \right]^{3/2}}$ (27)


• Bispectrum of the observed map. We will concentrate
here on the quantities $B_\ell = B_{\ell\ell\ell}$, which in the case of
statistical isotropy are related to the $B_3$ function defined above
as

$B_3(\ell_1 m_1, \ell_2 m_2, \ell_3 m_3) = \begin{pmatrix} \ell_1 & \ell_2 & \ell_3 \\ m_1 & m_2 & m_3 \end{pmatrix} B_{\ell_1 \ell_2 \ell_3}$ (28)

where the (...) is the Wigner 3-j symbol. In particular,
we will be interested in the dimensionless bispectrum,
$I_\ell = B_\ell / C_\ell^{3/2}$ (e.g. Ferreira, Magueijo & Górski 1998).
The absolute value of this quantity will be the same for
any frequency, while it will change its sign when we are
observing above or below $\lambda = 1.25$ mm. A fast and efficient
method to compute the angular bispectrum up to $\ell \sim 100$
for maps on the sphere is described in Komatsu et al. (2002),
and applied to COBE data. However, for the case of small
patches of sky (as is the case for CBI or BIMA), we can use
the flat-sky approximation, and the estimator described in
Santos et al. (2002).
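
The real-space estimators in the list above can be sketched directly on a list of pixel values. This is a minimal illustration, not the authors' pipeline: the toy map (gaussian noise minus an exponential term, mimicking a negative tail), the choice to exclude the peak bin from both areas in the asymmetry, and all function names are assumptions. The gaussian error on the skewness follows from ${\rm Var}(E[M_3]) = 6\,{\rm Var}(x)^3/N_{\rm pix}$, which gives a standard deviation of $\sqrt{6/N_{\rm pix}}$ for the normalised skewness.

```python
import math
import random


def pd_histogram(pixels, delta_d):
    """Bin centres and unit-area P(D) for deflections about the mean level."""
    mean = sum(pixels) / len(pixels)
    defl = [x - mean for x in pixels]
    lo = min(defl)
    nbins = int((max(defl) - lo) / delta_d) + 1
    counts = [0] * nbins
    for d in defl:
        counts[int((d - lo) / delta_d)] += 1
    centres = [lo + (k + 0.5) * delta_d for k in range(nbins)]
    pdf = [c / (len(defl) * delta_d) for c in counts]
    return centres, pdf


def asymmetry(pdf, delta_d):
    """Area above the peak minus area below it (peak bin excluded from both)."""
    peak = max(range(len(pdf)), key=lambda k: pdf[k])
    above = sum(pdf[peak + 1:]) * delta_d
    below = sum(pdf[:peak]) * delta_d
    return above - below


def skewness(pixels):
    """Skew = M3 / sigma^3 using the (biased) sample moments."""
    n = len(pixels)
    mean = sum(pixels) / n
    m2 = sum((x - mean) ** 2 for x in pixels) / n
    m3 = sum((x - mean) ** 3 for x in pixels) / n
    return m3 / m2 ** 1.5


def skewness_sigma_gaussian(npix):
    """1-sigma error of the skewness for pure gaussian noise, lowest order."""
    return math.sqrt(6.0 / npix)


def toy_map(npix=20000, seed=1):
    """Hypothetical pixel values: gaussian noise plus a negative exponential tail."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) - rng.expovariate(0.5) for _ in range(npix)]


if __name__ == "__main__":
    pixels = toy_map()
    centres, pdf = pd_histogram(pixels, 0.5)
    print("asymmetry:", asymmetry(pdf, 0.5))
    print("skewness:", skewness(pixels),
          "+/-", skewness_sigma_gaussian(len(pixels)))
```

Comparing the measured skewness with `skewness_sigma_gaussian` gives a quick significance estimate under the gaussian-noise hypothesis only; pixels correlated by the beam would reduce the effective number of independent samples.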


Any of the above estimators is able to detect an excess of
positive unresolved sources over negative ones (or vice versa).
However, the study of the P(D) function is preferable to the
computation of skewness, given that it contains much more
information. Unfortunately, obtaining the P(D) function for
sources/clusters from noisy data requires more integration
time than just the detection of skewness, as we will see in
a later section.


7 SOURCE COUNTS, P(D), SKEWNESS AND BISPECTRUM OF SZ CLUSTERS

P(D) analysis gives much more information than the power
spectrum about the sources of CMB fluctuations, and even
more than the skewness or the bispectrum. Therefore, we will
describe below the expected P(D) functions, skewness and
bispectrum for simulated maps of SZ clusters. Recovering the
full P(D) function will demand considerable effort from
observers, but the results described here should make our
predictions in the following subsections easier to understand.

In Figure 5 we present the results of simple modelling
of the thermal SZ effect, following the model where "negative"
clusters are assumed to have truncated power-law source
counts and "positive" radio sources have power-law source
counts up to very low fluxes. In both cases it is possible to
compute P(D) numerically from the equations given in an
earlier section. For definiteness, we use here $S_0 = 0.1$ mJy,
which roughly corresponds to $M_{\rm min} = 5 \times 10^{13}\,M_\odot$.
We are implicitly assuming here that no source subtraction has
been carried out on the map, so the width of the P(D) functions
is directly related to the source confusion in each case.

We consider here two cases: $\theta_b = 10'$, which is roughly
the angular resolution of current experiments (like CBI), and
$\theta_b = 1'$, which corresponds to the angular resolution of future
experiments dedicated to measuring clusters (ACT, AMI,
AMIBA, APEX and the 8 m South Pole Telescope will have angular
resolutions of 1.7', 1.5', 2', 0.8' and 1.3', respectively).

For the beam width $\theta_b = 10'$, the negative sources with
truncated source counts produce practically no difference
with respect to the P(D) for "negative" source counts with the
same slope extended to zero flux. This is because, for this
case, the typical flux cutoff is well below the flux which gives
the maximum contribution to the P(D). We see immediately
that the "positive" radio sources alone have a completely
different P(D) distribution from the "negative" SZ clusters. For
the beam width $\theta_b = 10'$, the contribution of positive sources to
the total power is larger than the contribution of "negative"
SZ clusters, and therefore the positive wing of P(D) is similar to
the P(D) for sources only. However, the negative wings are
very different.

For the beam width $\theta_b = 1'$, the negative tail of the
P(D) function of clusters becomes more important than
in the previous case. The reason for this dependence is the
following: once we have specified the shape of the source counts,
and we define the experiment (i.e. we specify $\theta_b$), then the
P(D) function is completely defined, and it has the dominant
contribution coming from the flux range between the
flux at which we have one source per beam area, and the flux
at which we expect one source every 30-40 beams. Thus,
for smaller beam areas we are sampling lower fluxes where,
according to figure 3, clusters become more numerous compared
with radio sources (at a given flux).

Figure 5. P(D) functions derived from analytical modelling of
source counts, for: (a) "negative" sources described with a
power-law with parameters $K = 1$ Jy$^{-1}$ sr$^{-1}$, $\beta = 2.5$, and a
low-flux cut-off of $S_0 = 0.1$ mJy (dashed line); (b) same as before, but
without considering the cut-off ($S_0 = 0$, dot-dashed line); (c)
"positive" sources following a power law with $K = 54$ Jy$^{-1}$ sr$^{-1}$
and $\beta = 2.15$ (dotted lines); (d) sum of the maps of clusters (a)
and sources (c), so the P(D) is the convolution of those two cases.
The beam is assumed to be gaussian, with a $\theta_b$ of 10' in the upper
panel, and 1' in the lower one. The deflection D is referred to $\bar{D}$ in
all cases, so all these curves give $\bar{D} = 0$. Comparing cases (a) and
(b), we can see the effect of a flux density cut-off on the shape
of the P(D) function. A cut-off becomes important when we
observe with a small beam, i.e. at high angular resolution. For these
two values of $\theta_b$, the asymmetry introduced by radio sources
dominates, because for these values of the spectral indices they are
more numerous at a given flux, although for the low value of
$\theta_b$ (lower panel), clusters increase their relative contribution.
However, the figure shows that studying the P(D) function provides
information on both positive and negative contributions.

We would also mention that if we go deep enough,
we would reach the low flux cut-off for clusters, and positive
sources will dominate again. The existence of a low flux cut-off
is clearly seen in the figure for $\theta_b = 1'$, where the shape
of the P(D) function is sensitive to the cutoff flux, and thus
is completely different from the shape that we would expect
if we had no truncation flux. Therefore, we can see that
P(D) at these angular resolutions will also give us information
about the truncation in flux for SZ clusters, and hence,
about the low mass cutoff.

In the following subsections, we will explain these general
aspects in more detail by using simulations of SZ clusters,
and we will derive the P(D) function from these simulated
maps. Throughout these subsections, we will illustrate the P(D)
curves for the angular resolution of $\theta_b = 1'$.


7.1 Modelling clusters using Press-Schechter prescription

Statistical properties of the population of SZ clusters have
been extensively studied in the literature (see SWH, fig. 4
in that paper, for a recent review). For the case of the
power spectrum, the published estimates show differences
of an order of magnitude, although these differences can be
understood as due to the different scaling relations and mass
ranges considered in each case. Therefore, we will be interested
here in determining the qualitative behaviour of these
new quantities (P(D), skewness and bispectrum). For this
purpose, we will use a simple modelling of clusters, based
on a Press-Schechter (PS) prescription. For predictions in
agreement with hydrodynamic simulations, it is possible to
use refined versions of the PS formalism, such as those
described in Sheth & Tormen (1999) or in
Jenkins et al. (2001). In PS, the comoving number density
of bound objects of total mass M at redshift z is given by


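$\frac{dn(M,z)}{dM} = -\sqrt{\frac{2}{\pi}}\, \frac{\bar{\rho}}{M}\, \frac{d\sigma(M,z)}{dM}\, \frac{\delta_c}{\sigma^2(M,z)} \exp\left[ -\frac{\delta_c^2}{2\sigma^2(M,z)} \right]$ (29)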


where $\bar{\rho}$ is the mean comoving background density, $\sigma^2(M, z)$
is the variance of the linear density fluctuation field filtered
on some mass M, and $\delta_c$ is the linear density contrast of a
perturbation that has virialized.
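
To make the shape of the Press-Schechter mass function concrete, the sketch below evaluates it at $z = 0$ for a purely hypothetical power-law $\sigma(M)$. The normalisation mass `m8`, the slope `alpha` and the function names are assumptions chosen for illustration; they are not the scaling adopted in the paper. The absolute value of $d\sigma/dM$ is used so that the number density is positive.

```python
import math

DELTA_C = 1.686          # linear density contrast at virialisation
RHO_CRIT_H2 = 2.775e11   # critical density in h^2 M_sun Mpc^-3


def sigma_power_law(mass, sigma8=0.9, m8=6.0e14, alpha=0.25):
    """Hypothetical rms fluctuation sigma(M) = sigma8 (M / m8)^(-alpha)."""
    return sigma8 * (mass / m8) ** (-alpha)


def dsigma_dm_power_law(mass, sigma8=0.9, m8=6.0e14, alpha=0.25):
    """Analytic derivative of the power-law sigma(M) above."""
    return -alpha * sigma_power_law(mass, sigma8, m8, alpha) / mass


def ps_dn_dm(mass, omega_m=0.3):
    """Press-Schechter dn/dM at z = 0, mass in h^-1 M_sun.

    Returns a number density per unit mass in h^4 Mpc^-3 M_sun^-1.
    """
    rho_mean = omega_m * RHO_CRIT_H2
    sig = sigma_power_law(mass)
    dsig = dsigma_dm_power_law(mass)
    return (math.sqrt(2.0 / math.pi) * rho_mean / mass * abs(dsig)
            * DELTA_C / sig ** 2
            * math.exp(-DELTA_C ** 2 / (2.0 * sig ** 2)))


if __name__ == "__main__":
    for m in (5.0e13, 1.0e14, 1.0e15, 5.0e15):
        print(m, ps_dn_dm(m))
```

The output falls steeply at high mass because of the exponential factor, while it keeps growing toward low masses, which is why a lower mass cutoff matters far more than the upper one.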

In our modelling, we will assume the value of $\delta_c =
1.686$ (see e.g. Molnar & Birkinshaw 2000), and the scaling
of $\sigma(M,z)$ with mass from Viana & Liddle. As
discussed in section 2, it is necessary to introduce
a mass cutoff because dn/dM diverges at low masses.
Following Komatsu & Kitayama (1999), we will use $M_{\rm min}$ =
$5 \times 10^{13}\,h^{-1} M_\odot$, and $M_{\rm max} = 5 \times 10^{[\ldots]}\,h^{-1} M_\odot$. Changing
the upper limit has little effect on the predictions because 
the PS function falls exponentially. The effect of changing 
the lower limit is discussed below. 

For illustration, we have adopted here the 'concordance'
model of Ostriker & Steinhardt (1995), which has $\Omega_{\rm tot} = 1$,
with $\Omega_m = 0.3$ and $\Omega_\Lambda = 0.7$, $n = 1$, $h = 0.67$ (where
$H_0 = 100\,h$ km s$^{-1}$ Mpc$^{-1}$), and the normalisation $\sigma_8 = 0.9$.
For the scaling relation of the gas temperature with the mass
and the redshift of the cluster, we use the scaling pointed
out in section 2, with


$T_{\rm gas} = [\ldots] \left( \frac{M}{10^{15}\,h^{-1} M_\odot} \right)^{2/3} (1 + z)$ keV (30)


and we will adopt a $\beta$-model for the intra-cluster gas with
$\beta = 2/3$. The relationship between the virial radius ($r_v$)
and the core radius ($r_c$), and their scalings with mass and
redshift $z$, are those obtained assuming spherical collapse (see
e.g. Atrio-Barandela & Mücket 1999; Molnar & Birkinshaw
2000), and an entropy-driven model with $\epsilon = 0$ for the core
evolution (e.g. Bower 1997). The parameters of our cluster
model are $r_v(z = 0) = [\ldots]$ Mpc, $r_c(z = 0) = 0.13\,h^{-1}$
Mpc, and $n_c(r = 0, z = 0) = 2 \times 10^{-3}$ cm$^{-3}$. In the Appendix
we study the dependence of our results on the scaling
assumptions, as well as the dependence on the normalisation
$\sigma_8$.

We then generate 15 realisations of a 1°-side map of SZ
clusters using a Press-Schechter law, and a single class of
clusters. We have chosen these values to allow direct comparison
with the results of the hydrodynamic simulations
of the thermal SZ effect described in SWH^1, which correspond
to 15 maps of the same size of the Comptonization
parameter y due to structure in the same $\Lambda$CDM
model between $z = 0$ and $z = 19$. SWH computed the angular
power-spectrum of the SZ effect from these maps, as
well as the source counts of thermal SZ sources. The mean
Comptonization parameter in our 15 Press-Schechter realisations
is $\langle y \rangle = 2.1 \times 10^{-6}$.

7.2 Source counts for clusters following a 
Press-Schechter prescription 

When generating the previous 15 realisations, we keep the
total flux of all the simulated clusters, as well as the core and
virial radius, so we are able to find out the source counts for
our maps. These source counts are presented in figure 6. In
the top panel, we show these counts as a function of the flux,
for the frequency of 30 GHz. As expected from the result
of De Luca, Désert & Puget (1995), the slope of the source
counts at fluxes greater than ~ 1 mJy corresponds to a
Euclidean power-law ($\beta = 2.5$). However, the amplitude of
our curve is different from theirs. Our source counts are well
fitted by $n(S) \approx 1\,(S/1\,{\rm Jy})^{-2.5}$ sr$^{-1}$ Jy$^{-1}$ at 30 GHz, and
by $n(S) \approx 28\,(S/1\,{\rm Jy})^{-2.5}$ sr$^{-1}$ Jy$^{-1}$ at 150 GHz, while their
source counts at 150 GHz is n{S) ~ S.6{S/lJy)~^'^ sr“^ 
Jy$^{-1}$. This difference is due to the fact that they introduce
an extra normalisation factor in the PS mass function to fit
the observed mass function in X-rays, and that changes the
total amplitude.

Our prediction for the source counts compares well with
that of SWH (open squares), although it is clear that the
hydrodynamical simulations do not show a strong cut-off in
flux, but instead contain many more 'small' (low mass) objects.
This excess of small objects with respect to the PS result will be
responsible for a larger power at small angular scales, as
has been discussed in SWH. This point is seen more clearly
in the bottom panel of figure 6, where we show the source
counts as a function of the angular radius of the cluster. We
define here the angular radius as the radius which contains
half of the total flux of the cluster. It is interesting to see
that if we only consider clusters with a flux density greater
than a given value, then the brightest clusters in the
PS maps are the largest ones in size. From this figure, we
immediately see that for the angular resolutions of the
upcoming SZ experiments ($\theta_b \sim 1'$) more than 90% of the
clusters will be unresolved objects (or even more, if we look
at the SWH source counts). We would like to stress that
the coordinates that we are using to plot the source counts
are specially suitable, because they directly give us the total
amount of sources at a given angular size/flux interval.
In these coordinates, we clearly see that our PS modelling
with $M_{\rm min} = 5 \times 10^{13}\,h^{-1} M_\odot$ produces a peak of objects
with fluxes around 0.04 mJy and sizes around 0.6', and below
these quantities we have a strong cutoff (both in size
and flux), as expected from the discussion in section 2. It is
important to note that for the angular size of $\theta = 0.6'$ (close
to the resolution of APEX), we have around ~ 1000 sources
per square degree, so given that the number of beam areas
inside a square degree is $\sim 10^4$ for $\theta_b = 0.6'$, we have that
at this angular scale we will obtain 1 source every 10 beams.
These values are inside the confusion regime (e.g. Scheuer),
so the P(D) function is a suitable tool to study them. For
larger values of $\theta_b$ (e.g. 1.3' for the South Pole Telescope,
or ~ 10' for present day experiments), we will be even more
inside confusion due to these small clusters.

^1 These simulations are quoted in that paper as 134A, and have
been corrected for the h factor.

Figure 6. [Axes: flux S (mJy) in the top panel; angular radius
$\theta$ (arcmin) in the bottom panel.] Top: Differential source counts
at 30 GHz from the 15 simulated maps following the PS prescription
(solid line). As expected, the behaviour at fluxes $S > 1$ mJy is
well described by a power-law with $\beta = 2.5$, and we have a strong
cutoff at low fluxes, corresponding to the mass cutoff
$M_{\rm min} = 5 \times 10^{13}\,h^{-1} M_\odot$. In this case, this power law is
described by $n(S) \approx 1\,(S/1\,{\rm Jy})^{-2.5}$ sr$^{-1}$ Jy$^{-1}$. For
comparison, we also show (open squares) the results
from SWH. These curves are also presented in Figure 4, in the
context of the source counts for radio sources. Bottom: Differential
source counts from the 15 maps following a PS prescription,
as a function of the source angular radius. We again use the
coordinates $|\theta\,dN/d\theta|$, where we explicitly see the typical size of the
most common sources. We present the source counts for the whole
dataset (solid circles with solid line), and considering only sources
with a flux at 30 GHz greater than a certain threshold. We can
see that the most intense sources are also the largest ones. We
also include the results from SWH (open squares) showing that
the hydrodynamical simulations have a larger amount of small
sources, and a wider curve.

Figure 7. Dependence of the differential source counts (as a
function of the source angular radius) on the selected minimum
mass for the Press-Schechter mass function. Curves for
$M_{\rm min} \geq 5 \times 10^{13}\,h^{-1} M_\odot$ have been obtained from the 15 PS
realisations used in figure 6. The curve for
$M_{\rm min} = 2 \times 10^{[\ldots]}\,h^{-1} M_\odot$ has been derived, just for
illustration purposes, from a small realisation of 3 square degrees
using this new low mass cutoff. We also show the source counts
from SWH.
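
The beam-counting argument above (about $10^4$ beam areas per square degree at $\theta_b = 0.6'$, hence roughly one source every 10 beams for ~ 1000 sources per square degree) can be reproduced with a few lines. This is a minimal sketch assuming a gaussian beam whose FWHM is $\theta_b$, with solid angle $\pi\theta_b^2/(4\ln 2)$; the function names are illustrative.

```python
import math


def gaussian_beam_area_arcmin2(fwhm_arcmin):
    """Solid angle of a gaussian beam, pi * FWHM^2 / (4 ln 2), in arcmin^2."""
    return math.pi * fwhm_arcmin ** 2 / (4.0 * math.log(2.0))


def beams_per_square_degree(fwhm_arcmin):
    """Number of beam areas in one square degree (3600 arcmin^2)."""
    return 3600.0 / gaussian_beam_area_arcmin2(fwhm_arcmin)


def beams_per_source(fwhm_arcmin, sources_per_deg2):
    """Average number of beam areas per source at a given surface density."""
    return beams_per_square_degree(fwhm_arcmin) / sources_per_deg2


if __name__ == "__main__":
    for fwhm in (0.6, 1.3, 10.0):
        print(fwhm, beams_per_square_degree(fwhm),
              beams_per_source(fwhm, 1000.0))
```

For a fixed surface density of sources, the number of beams per source falls as $\theta_b^{-2}$, which is why larger beams sit deeper in the confusion regime.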

The natural question which follows is whether
this conclusion holds for different values of $M_{\rm min}$. For
this purpose, we should study the dependence of the source
counts as a function of the angular size using different values
for $M_{\rm min}$ in the simulations. This can be done using our
simulations by selecting clusters with masses above the new
threshold. This is done in figure 7. As we would expect, the
$\theta$ value at which we have the maximum amount of sources
($\theta_{\rm peak}$) roughly scales as $r_v$. Fitting our data, we have:


$\theta_{\rm peak} \approx 0.6' \left( \frac{M_{\rm min}}{5 \times 10^{13}\,h^{-1} M_\odot} \right)^{1/3}$


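We can also infer from here the total amount of sources (per
square degree) that we have at that angular scale ($N_{\rm peak}$),
obtaining

$N_{\rm peak} \approx 1000 \left( \frac{M_{\rm min}}{5 \times 10^{13}\,h^{-1} M_\odot} \right)^{-2.8}$ deg$^{-2}$

Using these two last equations, we conclude that for the
angular resolution $\theta_{\rm peak}$, we expect 1 source every
$10 \times (M_{\rm min}/5 \times 10^{13}\,h^{-1} M_\odot)^{2.13}$ beams, which is
smaller than 30 only for $M_{\rm min} < 8.4 \times 10^{13}\,h^{-1} M_\odot$.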

Finally, we can see from figure 7 that all future experiments
(with $\theta_b \sim 1'$) will be confusion limited at this angular
resolution if $M_{\rm min} < 1.1 \times 10^{[\ldots]}\,M_\odot$. For all of them, a
P(D) analysis will be suitable, and in addition, this analysis
will provide information about the low mass cutoff.



Figure 8. First panel: P(D) function for a single map of SZ
clusters following a Press-Schechter prescription (solid line), and
for a hydrodynamic simulation from Springel et al. 2001
(dot-dashed line). Both maps are 1° on a side, with a pixel size of
0.12'; they have been smoothed with a gaussian beam of $\theta_b = 1'$,
and are given for the Rayleigh-Jeans region of the spectrum.
These curves can be transformed into PDF curves for the y parameter
by the transformation $D = -2\,y\,T_{\rm cmb}$, so the P(y) function will
exhibit a positive tail. The best gaussian fit for both curves is
shown with dotted lines, their widths being $\sigma = 6.1\,\mu$K and
$\sigma = 3.9\,\mu$K, respectively. Second panel: same functions but
computed dividing the map by the rms prior to the P(D) computation.
When rescaling by the rms, both curves show the same qualitative
behaviour, with a nearly linear tail in these coordinates (log P(D)
vs D). The long-dashed line shows a straight line in these coordinates
for a slope of unity.


7.3 Predictions of the Press-Schechter approximation for P(D) function, skewness and bispectrum

We will now be interested in obtaining the qualitative
behaviour of the P(D) function, the skewness and the bispectrum
for SZ clusters following the PS mass function. Therefore,
throughout this section and the next one, we will restrict
ourselves to the computed simulations, with
$M_{\rm min} = 5 \times 10^{13}\,h^{-1} M_\odot$.

• P(D) function. In figure 8 we present the P(D) analysis
of the maps generated in both PS and hydrodynamical
simulations using a beam size of $\theta_b = 1'$. Usually, the maps
of the thermal SZ effect are presented as maps of the y-parameter.
However, observers see the brightness distribution on
the sky. Therefore, we present deviations on the map for
the Rayleigh-Jeans temperature. The temperature in the
Rayleigh-Jeans region of the spectrum is connected with
the y-parameter by the simple relation $D = -2\,y\,T_{\rm cmb}$
(Zeldovich & Sunyaev 1969), but the P(D) graphs are valid
for any frequency: we only need to recompute D and P(D)
according to the corresponding formula, but using f instead of g.
Therefore, our plots for P(D) are easy to convert into the PDF for
the y parameter. This opens the possibility of comparing our
results for the PS approximation with those of other authors
(e.g. Seljak, Burwell & Pen 2001).

The P(D) curves for both PS and hydrodynamic simulations
show very broad non-gaussian negative wings. To
demonstrate this, we present in figure 8 the best
fit gaussian curves for both P(D) functions. It should be
noted that the width of the best fit gaussian ($\sigma = 6.1\,\mu$K
and $\sigma = 3.9\,\mu$K, respectively) is smaller than the rms of the
maps (rms(PS) = 13.9 $\mu$K and rms(SWH) = 12.0 $\mu$K),
because the fitting reproduces the central part of the curve,
while the rms is larger because of the negative tail. If these
curves were gaussian, the rms and the $\sigma$ value would
be the same. The two P(D) curves have sufficient differences,
specially in the slope of the negative tail. Both of them fall
more rapidly than the gaussian fit, and the PS prescription
produces a slightly broader curve (because of the slightly
larger rms). We will not discuss these differences in detail
because it is well known from previous studies that these two
methods give significantly different results. However,
it is striking that if we include the 'normalisation' D/rms,
dividing the maps by their rms before the P(D) computation,
the resulting P(D) curves are similar in shape, showing
a practically identical slope in the intermediate asymptotic
region (see lower panel of fig. 8).

• Intermediate asymptotic for the P(D) function.
In a broad region, the normalised P(D) curve can be
described by a simple analytical formula: the pronounced left
wing of the P(D) distribution is close to a straight line
in the coordinates we used in our figure (log P(D) versus
D). Therefore, we see that $P(D) \sim \exp(aD)$ in a sufficiently
broad region of $-3 < D/{\rm rms} < 0$. This is a striking
feature of P(D) which might help to distinguish SZ clusters
from the noise of the observed maps, given that this
behaviour is completely different from a gaussian. We should
stress that both the Press-Schechter approximation and the
hydrodynamic simulations produce practically the same
"intermediate asymptotic". In addition, this factor a has a
strong dependence on the normalisation $\sigma_8$, as we show
in the Appendix, where the scaling of a with $\sigma_8$ and the
numerical coefficients for this expression are given.

It is well known that the P(D) distribution for sources
with power-law source counts $n(S) = K S^{-\beta}$ gives a simple
power-law asymptotic $P(D) \sim D^{-\beta}$ for high values
of the deflection, because in that case the distribution
becomes dominated by strong, well-resolved sources. To look
for this asymptotic, we present the graph of the derivative
$d\log P(D)/d\log(-D)$ versus D/rms in Figure 9. In
these coordinates the asymptotic $P(D) \sim D^{-\beta}$ will be the
horizontal line with $d\log P(D)/d\log(-D) = -2.5$. Figure
9 shows that $d\log P(D)/d\log(-D) \approx aD + B$ in the range
$-2.5 < D/{\rm rms} < 0$, which corresponds exactly to the
intermediate asymptotic described above, $P(D) \sim \exp(aD)$. The
limiting value of $-\beta = -2.5$ is reached only for very large
deviations.

Figure 9. Slope of the negative tail of the P(D) function for SZ
clusters following the Press-Schechter prescription. Shown are the
values of $d\log P(D)/d\log(-D)$ for the average of the 15 realisations
of SZ clusters, in the Rayleigh-Jeans region of the spectrum,
and using a gaussian beam with $\theta_b = 1'$ (solid line). The
dot-dashed line shows the same calculation but for the SWH
simulations, while the dotted line shows the analytic calculation for
power-law source counts with $n(S) \propto S^{-2.5}$.
The horizontal axis is plotted in terms of the rms of the map. The
slope of this tail tends to the value of -2.5 (horizontal long-dashed
line), as expected for power-law differential source counts with
$\beta = 2.5$. However, in this 'intermediate' region, the P(D) curve
follows a nearly exponential law ($P(D) \propto \exp(aD/{\rm rms})$), with
$a \approx 1.0$. This behaviour is completely different from that of a
gaussian function.

Finally, it is important to notice that this 'intermediate
asymptotic' behaviour can also be obtained semi-analytically,
by inserting power-law source counts with $\beta = 2.5$
into the P(D) formalism, as is also shown in figure 9. Therefore,
this intermediate asymptotic is closely related to the
PS prescription, which is ultimately responsible for the
Euclidean type power-law behaviour of the source counts for
clusters ($N \sim S^{-5/2}$).

• Skewness. We investigate the dependence of the skewness
introduced by SZ clusters as a function of angular scale,
both in real and Fourier space. In real space, to obtain
the skewness at a certain scale, we smooth the SZ map using
a gaussian filter of FWHM $\theta_b$, and measure the skewness on
the smoothed map. The results for modelling using the
Press-Schechter approximation are shown in Figure 10. Error bars
correspond to the field-to-field variance from this ensemble
of maps. In the same figure, a dot-dashed line shows
the same calculation, but for the hydrodynamic simulations
of SWH. It can be seen that the shape of the curve and
the values from the two sets of simulations are similar for beam
sizes larger than $\theta_b \sim 2'$, although for very small angular scales,
the hydrodynamical simulations produce a larger skewness,
suggesting that it may be even easier to detect. This
discrepancy at small angular scales between PS and hydrodynamical
simulations has already been discussed by many
authors when predicting the power spectrum, as we discuss
in the following item.
Figure 10. Negative skewness due to SZ clusters as a function
of the angular scale. These values have been obtained from 15
(1°-side) simulated maps of the thermal SZ effect using a
Press-Schechter prescription (see details in text). For each angular scale,
the skewness is obtained after convolving the map with a gaussian
beam of FWHM equal to $\theta_b$. The solid line shows the average
value of the skewness over these 15 realisations, while dotted
lines show the 1-sigma error from this ensemble of maps. The
dot-dashed line shows the same result, but obtained from the 15
maps of SWH. The hydrodynamic simulations show an excess of
skewness with respect to the PS modelling at small angular scales.


• Bispectrum. We present the 'skewness' in
Fourier space by computing the dimensionless bispectrum
($I_\ell$), using the estimator described in Santos et al.
(2002). For completeness, we also derive the power
spectrum ($C_\ell$), although this computation for the case
of a Press-Schechter prescription has been done by
several authors (e.g. Atrio-Barandela & Mücket 1999;
Molnar & Birkinshaw 2000). Here, we explicitly show that
SZ clusters provide a negative contribution to the bispectrum.
Similar curves to the ones presented here for the
skewness and bispectrum can be found in Cooray (2000).
The first panel of Figure 11 shows the power spectrum results
for the PS case, in the Rayleigh-Jeans region of the
spectrum. To allow comparison, we also include a standard
$\Lambda$CDM power spectrum from primordial fluctuations,
and the measurements from CBI (Mason et al. 2002) and
BIMA (Dawson et al. 2002). The dot-dashed line shows
the result from SWH, which has a similar amplitude to our
PS modelling, although it peaks at larger $\ell$ with a wider
shape. This qualitative behaviour is common to all the
hydrodynamical simulations (see SWH for a review of the recent
predictions), so they have more power at smaller angular
scales.

The second panel in Figure 11 shows the angular bispectrum
for our PS modelling. As expected, 'skewness' both in
real and Fourier space is negative, but the non-gaussianity
in Fourier space is smaller than in real space. When comparing
our results with those using the SWH simulations,
we find again that at small angular scales (high $\ell$) the
hydrodynamical simulations produce a larger skewness.
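
The "intermediate asymptotic" discussed in the list above can be contrasted with a gaussian through the logarithmic slope $d\log P/d\log(-D)$. The sketch below is a minimal numerical check, assuming unit rms and the exponential form $P \propto \exp(aD/{\rm rms})$ quoted in the text; the function names are illustrative. Writing $u = D/{\rm rms} < 0$, the identity $d\log P/d\log(-D) = u\,d\log P/du$ gives a slope linear in $u$ for the exponential law and quadratic in $u$ for a gaussian.

```python
def log_slope(log_p, u, h=1e-4):
    """Numerical d log P / d log(-D) at u = D/rms < 0, given log P(u).

    Uses d log P / d log(-D) = u * d log P / du with a central difference.
    """
    dlogp_du = (log_p(u + h) - log_p(u - h)) / (2.0 * h)
    return u * dlogp_du


def log_p_exponential(u, a=1.0):
    """log P for the intermediate asymptotic P ~ exp(a D / rms)."""
    return a * u


def log_p_gaussian(u):
    """log P for a unit-rms gaussian, up to an additive constant."""
    return -0.5 * u * u


if __name__ == "__main__":
    for u in (-0.5, -1.0, -2.0, -2.5):
        print(u, log_slope(log_p_exponential, u), log_slope(log_p_gaussian, u))
```

At $u = -2.5$ and $a = 1$ the exponential law gives a slope of -2.5, whereas the gaussian slope is already -6.25, which illustrates how different the two tails are in these coordinates.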




Figure 11. Power spectrum and bispectrum for the thermal SZ
effect in the Rayleigh-Jeans region of the spectrum, where the
SZ clusters have the same spectrum as the primordial fluctuations
(but are negative). First panel: Angular power spectrum
(in $(\Delta T/T)^2$ units) for the SZ effect averaged over 15 simulations
using the Press-Schechter prescription. Dotted lines show
the 1-sigma field-to-field dispersion. Also shown are a standard
$\Lambda$CDM model, and the reported measurements of CBI (Mason
et al. 2002, open squares), and BIMA (Dawson et al. 2002, filled
circles). The dot-dashed line shows the result using the 15
hydrodynamic simulations of SWH. Second panel: Angular bispectrum
($I_\ell = B_\ell / C_\ell^{3/2}$) for the same 15 simulations using the PS
prescription. Again, the dotted lines show the 1-sigma field-to-field
dispersion, and the dot-dashed line the result using the 15 maps of
SWH. As expected, 'skewness' in Fourier space is also negative in
this frequency range, but smaller than in real space. Both in the
power and in the skewness, the hydrodynamic simulations show
an excess of signal (in absolute value) with respect to the PS
modelling at large $\ell$.


8 ROLE OF RADIO SOURCES 

We now discuss the influence of radio sources on the observed
$P(D)$, skewness and bispectrum. The formalism describing the
confusion noise introduced in a map by radio sources is widely known,
and has already been presented in section 2. This formalism is easy
to extend to any moment of the observed map, in particular the
skewness. Hence, if the population of sources at our observing
frequency is described by the differential source counts
$n(S) = K\,(S/1\,\mathrm{Jy})^{-\beta}$ sr$^{-1}$ Jy$^{-1}$, then the
confusion 'skewness' on our map is given by
$$ \mathrm{Skew}_c \equiv \frac{\langle s^3 \rangle}{\langle s^2 \rangle^{3/2}} = \frac{(3-\beta)^{3/2}}{4-\beta}\,(K\Omega_e)^{-1/2}\,S_c^{(\beta-1)/2} \tag{31} $$


Note that this equation, with a minus sign, is also valid for
negative clusters following power-law source counts. We see that the
contribution of radio sources to the skewness will depend on our
source subtraction threshold, $S_c$, i.e. on the brightest sources
remaining in the map. In order to distinguish the signal coming from
SZ clusters, it is necessary to decrease these quantities below the
level of the SZ signal.

It is obvious that when adding two maps (sources and SZ clusters in
this case), both the power ($\langle s^2 \rangle$) and the third
moment ($\langle s^3 \rangle$) of the resulting distribution are the
sum of the individual quantities, assuming uncorrelated maps. Let
$\sigma_1$ and $\sigma_2$ be the rms (power) of each of these two
families of sources, and let Skew$_1$ and Skew$_2$ be the
(dimensionless) skewness of each of them. The rms of the combined map
is $\sqrt{\sigma_1^2 + \sigma_2^2}$, and the skewness is

$$ \mathrm{Skew} = \frac{\mathrm{Skew}_1\,\sigma_1^3 + \mathrm{Skew}_2\,\sigma_2^3}{(\sigma_1^2 + \sigma_2^2)^{3/2}} \tag{32} $$
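
As a minimal sketch of how equation (32) can be evaluated, the following Python function combines the rms and skewness of two independent, zero-mean maps. The function name and the numerical inputs are illustrative assumptions, not values taken from the simulations.

```python
def combined_skewness(sigma1, skew1, sigma2, skew2):
    """Skewness of the sum of two independent zero-mean maps, as in eq. (32)."""
    # Third moments and variances add for independent maps.
    third_moment = skew1 * sigma1**3 + skew2 * sigma2**3
    variance = sigma1**2 + sigma2**2
    return third_moment / variance**1.5


# Hypothetical example: positively skewed sources plus negatively skewed clusters.
sources = (10.0, 2.5)     # (rms, skewness), made-up numbers
clusters = (14.0, -3.0)   # (rms, skewness), made-up numbers
total = combined_skewness(*sources, *clusters)
print(f"combined skewness = {total:.3f}")
```

Because each term is weighted by the cube of its rms, the component with the larger rms dominates the sign of the combined skewness.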


Combining these last two equations, we can infer the required flux
threshold for the source subtraction in order to have an overall
negative skewness in a map containing sources and clusters. However,
it is also well known that using a single map we are unable to
subtract sources down to an arbitrary flux level, because of the
intrinsic confusion noise introduced by sources. As we have seen in
section 2, the minimum subtraction threshold is usually taken to be
$S_c = q\,\sigma_c$, with $q = 3$-5. Inserting this condition in
equation (31), we have


$$ \mathrm{Skew} = q\,\frac{3-\beta}{4-\beta} \tag{33} $$
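
The step leading to equation (33) can be sketched as follows. This assumes that the confusion variance for power-law counts takes the standard form $\sigma_c^2 = K\Omega_e S_c^{3-\beta}/(3-\beta)$ with the same effective solid angle $\Omega_e$ as in the skewness; that form is an assumption here, not quoted from this section.

```latex
\begin{align}
\sigma_c^2 &= \frac{K\,\Omega_e\,S_c^{3-\beta}}{3-\beta}
  \;\Longrightarrow\;
  (K\Omega_e)^{-1/2} = (3-\beta)^{-1/2}\,\sigma_c^{-1}\,S_c^{(3-\beta)/2}, \\
\mathrm{Skew}_c &= \frac{(3-\beta)^{3/2}}{4-\beta}\,(K\Omega_e)^{-1/2}\,S_c^{(\beta-1)/2}
  = \frac{3-\beta}{4-\beta}\,\frac{S_c}{\sigma_c}
  = q\,\frac{3-\beta}{4-\beta}
  \qquad (S_c = q\,\sigma_c).
\end{align}
```

For $\beta = 2$ this reduces to $\mathrm{Skew} = q/2$.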


Using $q = 5$ in this equation, and in the equation for the
confusion noise (13), we obtain the minimum contribution of radio
sources to any experiment which does not consider any source
subtraction strategy. In the particular case of $\beta \approx 2$,
which is the case for the observed radio sources at 30 GHz, we have
Skew $= q/2$, so the minimum skewness due to sources that we expect
without source subtraction is $\sim 1.5$-2.5 for $q = 3$-5, and the
minimum confusion noise is

$$ \sigma_c = 2.86\,(q/3)\,(K/90\,\mathrm{Jy^{-1}\,sr^{-1}})\,(\theta_b/10')^2\ \mathrm{mJy}. $$


If we now convert this into temperature using the Rayleigh-Jeans
approximation, we obtain

$$ \sigma_c = 10.8\,(q/3)\,(K_{30\,\mathrm{GHz}}/90\,\mathrm{Jy^{-1}\,sr^{-1}})\,(\nu/30\,\mathrm{GHz})^{-2}\ \mu\mathrm{K}, $$

which is independent of the beam size because $\beta = 2$. Observing
at a single frequency and with a single instrument allows us to go
down to $q = 3$ at the most. If we want to go deeper, we need to use
information from an instrument with a better angular resolution to
decrease the effective value of $q$, as we see from equation (13).

We illustrate in figure 12 how a source subtraction technique
affects the observed $P(D)$ function of a map containing sources and
SZ clusters. Radio sources are modelled here with power-law source
counts with parameters $K = 92$ Jy$^{-1}$ sr$^{-1}$ and $\beta = 2$.
We have considered different source subtraction thresholds,
parameterised in terms of $m \times \sigma_c(q = 5)$. Without source
subtraction at all, we cannot decrease $m$ below 5. The main result
is that, as we would expect, for small values of $m$ we delete the
strong positive


Figure 12. Top: Effect of the source subtraction on the P(D)
function of sources. We parameterise the source subtraction limit as
$S_c = m\,\sigma_c(q = 5)$, where $\sigma_c(q = 5) = 16.6\,\mu$K in
this particular case. Here, we consider the following values for $m$:
1, 3, 10 and 50. All these calculations correspond to radio sources
described by parameters $K = 92$ sr$^{-1}$ Jy$^{-1}$ and $\beta = 2$,
and using a gaussian beam of $\theta_b = 1'$. Bottom: Same as in the
top panel, but adding a realisation of SZ clusters following a PS
prescription. In all cases the negative tail is visible, but the
asymmetry and the skewness are positive for $m$ values greater than
3, so the detection of the SZ component will require a P(D) analysis
if we do not consider a source subtraction strategy. Simply measuring
the skewness is not enough to see the negative contribution.


deviations, so $P(D)$ becomes narrower and allows us to look for a
negative tail connected with SZ clusters. In particular, we see that
the negative tail is always visible, but the asymmetry and the
skewness are positive for $m > 3$, so a simple analysis of the
skewness will not be able to detect a negative contribution if we do
not subtract radio sources from the map.

We will now discuss how this picture changes with the observing
frequency and the beam size, assuming that we do not subtract sources
at all, so their contribution will be at least $q = 5$ (i.e.
$S_c = 5\,\sigma_c(q = 5)$). We present in figure 13 the $P(D)$
function for the 15 PS realisations, with and without adding to the
maps a simulation of sources with $K = 92$ Jy$^{-1}$ sr$^{-1}$ and
$\beta = 2$. Just for illustration, we also compute the $P(D)$
function extrapolating the radio sources up to 100 GHz using
$\alpha = 0.5$. As has been pointed out above,
Figure 13. Top: P(D) function for the 15 (1°-side) PS realisations
plus sources, assuming that we are able to subtract radio sources
down to $S_c = 5\,\sigma_c(q = 5)$. We calculate sources using the
parameters $K = 92$ sr$^{-1}$ Jy$^{-1}$ and $\beta = 2$, and we
present the results both at 30 GHz and at 100 GHz. The extrapolation
of radio sources up to 100 GHz has been done using a spectral index
$\alpha = 0.5$. We can see that, without a source subtraction
strategy, radio sources become of the same importance as, or even
much more important than, clusters at this angular scale for
$\nu = 30$ GHz, while they practically disappear at 100 GHz. Bottom:
P(D) function for clusters and sources as a function of the beam
size. We use the 15 PS realisations, plus sources as in the top
panel, and we compute the P(D) function at 30 GHz. As expected from
the simple inspection of figure 2, at larger beam sizes sources are
more numerous, so their tail dominates the asymmetry. At smaller beam
sizes, clusters become of importance.

at higher frequencies we would expect other populations of sources,
so the real $P(D)$ function might show a bigger positive wing. We can
see that at these angular scales ($\theta_b = 1'$), and without a
source subtraction strategy, clusters can be (at the most) of the
same importance as radio sources at 30 GHz. This fact is well known,
and several planned experiments that will operate close to these
frequencies (30 GHz) have also designed a source subtraction strategy
to eliminate the radio source confusion (e.g. AMI, Kneissl et al.
2001).

In the bottom panel of figure 13 we illustrate the dependence of the
shape of the $P(D)$ on the beam size ($\theta_b$). As we would expect
from the simple inspection of figure 2, and as we pointed out at the
beginning of section
[7|(see figure 1 ^, at larger beam sizes radio sources are more 


numerous than clusters, so their tail dominates the asymme¬ 
try. At smaller beam sizes, clusters become of importance. 

It is important to note here that all these calculations have been
done assuming uncorrelated sources, but in principle one could have
two types of correlations: spatial correlations between the sources
(or SZ clusters) themselves, and correlations between the positions
of the SZ clusters and the radio sources. In the first case, spatial
correlations of sources/clusters can be modelled to first order
(Barcons 1992) as an extra convolution with another gaussian. Thus,
the above tools can still be applied, although this will require a
much more detailed study. On the other hand, it is well known that
clusters of galaxies may contain radio point sources (e.g. Birkinshaw
1999; Cooray et al. 1998), so the signal from SZ clusters could be
diluted (Holder 2002b; Lin, Chiueh & Wu 2002). This point can be
easily checked by introducing spatial correlations between clusters
and radio sources. In the most unfavourable case of having a radio
source inside each cluster, we obtain from the previous simulations
that the skewness is underestimated by about 20%. This number is in
agreement with the result obtained by Holder (2002b), who showed that
the rms fluctuations of the thermal SZ effect could be underestimated
due to correlations between clusters and radio sources by as much as
30% at the observing frequency of 30 GHz and $\ell > 1000$.


9 DETECTION OF A NEGATIVE CONTRIBUTION OF CLUSTERS

The performance of the estimators described on section|^for 
detecting a negative contribution of clusters depends on the 
particular shape of the source counts for both radio sources 
and SZ clusters. In this section, we will concentrate on showing
that the skewness may be used to distinguish the nature of the
underlying fluctuations, using for that purpose a toy model for
sources and clusters. Once we are able to detect skewness in the map,
the $P(D)$ function will show an asymmetry. However, the
determination of the $P(D)$ function of the underlying sources will
require more observational effort than simply measuring the skewness.
Thus, the skewness will give us the minimum signal-to-noise level
required to distinguish between clusters and radio sources.

We illustrate these facts by computing the $P(D)$ functions and the
skewness for the complete set of 15 SZ maps based on PS modelling,
observed with a gaussian beam of $\theta_b = 1'$, and adding
different noise levels per pixel ($\sigma_{\rm noise}$), quoted in
terms of the power due to clusters ($\sigma_{\rm SZ}$). We then
consider the following cases for $\sigma_{\rm noise}/\sigma_{\rm SZ}$:
0.3, 1, 2, 5, 10 and 20. We will also assume that the noise is
gaussian and uncorrelated, as a first approximation. The resulting
$P(D)$ curves are shown in figure 14, while the measured rms in the
map, the recovered excess of power, and the observed skewness are
quoted in Table 2. We can see that the tail of the $P(D)$ function
has practically disappeared at $\sigma_{\rm noise}/\sigma_{\rm SZ} = 5$,
although it is still possible to measure skewness in the map at a
high significance. When we go to
$\sigma_{\rm noise}/\sigma_{\rm SZ} = 10$, the skewness is no longer
visible, although it is possible to detect an excess of power. At
$\sigma_{\rm noise}/\sigma_{\rm SZ} = 20$, nothing can be detected.
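
The trend described above follows directly from adding an independent gaussian noise map to the SZ map. The sketch below is a simple analytic estimate rather than the simulation itself: it assumes that noise adds variance but no third moment. The value of `SIGMA_SZ` is the one quoted in the text, while `SKEW_SZ` is an assumed intrinsic skewness, chosen only so that the unit-ratio case lands near the value quoted in Table 2.

```python
import math


def diluted_rms(sigma_sz, noise_ratio):
    """rms of the SZ map plus noise, with noise_ratio = sigma_noise / sigma_sz."""
    return sigma_sz * math.sqrt(1.0 + noise_ratio**2)


def diluted_skewness(skew_sz, noise_ratio):
    """Skewness after adding gaussian noise (zero third moment) to the map."""
    return skew_sz * (1.0 + noise_ratio**2) ** -1.5


def recovered_sigma_sz(rms, sigma_noise):
    """Excess-power estimate E[sigma_SZ] = sqrt(rms^2 - sigma_noise^2)."""
    return math.sqrt(rms**2 - sigma_noise**2)


SIGMA_SZ = 14.46   # muK, as quoted in the text
SKEW_SZ = -3.79    # assumed intrinsic skewness, for illustration only

for ratio in (0.3, 1.0, 2.0, 5.0, 10.0, 20.0):
    rms = diluted_rms(SIGMA_SZ, ratio)
    skew = diluted_skewness(SKEW_SZ, ratio)
    print(f"ratio={ratio:5.1f}  rms={rms:7.2f} muK  skew={skew:8.4f}")
```

The skewness falls as $(1 + \sigma_{\rm noise}^2/\sigma_{\rm SZ}^2)^{-3/2}$, much faster than the excess power is lost, which is why the power can still be measured after the skewness has become undetectable.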

If we now include sources in our maps, the determi¬ 
nation of how deep we need to integrate in order to get 
Table 2. Detectability of the skewness when adding white gaussian
noise to SZ maps.

$\sigma_{\rm noise}/\sigma_{\rm SZ}$   rms ($\mu$K)   E[$\sigma_{\rm SZ}$] ($\mu$K)   E[Skew]   $\sigma_{\rm Skew}$
 0.3                                    15.10          14.46                           -3.320    0.003
 1.0                                    20.43          14.44                           -1.340    0.003
 2.0                                    32.32          14.43                           -0.340    0.003
 5.0                                    73.67          14.15                           -0.030    0.003
10.0                                   145.35          14.84                           -0.001    0.003
20.0                                   289.78          18.67                            0.0004   0.003

We consider a simulated SZ map following the PS prescription,
covering 3 square degrees with a pixel size of 0.12' and a beam size
of $\theta_b = 1'$, with $\sigma_{\rm SZ} = 14.46\,\mu$K, and we add
different noise levels. The resulting P(D) functions are shown in
Figure 14, while we present here the observed power (rms) in the map,
the inferred signal due to SZ (estimated as
E[$\sigma_{\rm SZ}$] $= \sqrt{{\rm rms}^2 - \sigma_{\rm noise}^2}$),
and the measured skewness in the map (without applying any filtering
technique), with its standard deviation ($\sigma_{\rm Skew}$).


Figure 14. Shape of the P(D) function for clusters when adding
instrumental gaussian white noise. A single map of 3 square degrees
of SZ clusters following the PS prescription has been used, with
$\sigma_{\rm SZ} = 14.46\,\mu$K. We add to this map several
realizations of white gaussian noise, with the amplitude quoted in
the figure, and then we compute the P(D) function. The measured
excess of power and skewness in each particular case are quoted in
Table 2.


information about the skewness can be done in the following simple
way. We define here the signal-to-noise ratio per pixel, $S/N$, as
the quotient $\sigma_{\rm signal}/\sigma_{\rm noise}$. We quote this
quantity because it is easy to infer from the map. In addition, it is
also straightforward to convert these values into integration time
$t$ for a given experiment, because we usually have
$\sigma_{\rm noise} \propto 1/\sqrt{t}$. In the previous expression,
$\sigma_{\rm signal}$ corresponds to the power introduced by clusters
and sources all together. Then, if we want a $q$-sigma detection of
skewness, it is straightforward to derive from equation (32) that we
need a signal-to-noise ratio better than



$$ \frac{S}{N} > \left[ \left( \frac{\mathrm{Skew}^2\,N_{\rm pix}}{6 q^2} \right)^{1/3} - 1 \right]^{-1/2} \tag{34} $$


where Skew is the combined skewness of sources and clusters (32),
and $N_{\rm pix}$ is the number of pixels in the map, i.e.
$N_{\rm pix} = 4\pi f_{\rm sky}/\Omega_p$, with $f_{\rm sky}$ the
fraction of sky covered, and $\Omega_p$ the pixel solid angle. For
the typical values of skewness found in our simulations, the required
signal-to-noise ratio for the detection of skewness due to both point
sources and SZ clusters is around 25-40% larger than the required
ratio for the detection of an excess of power. In other words, we
need (the well-known result of) 1.5-2 times more integration time to
detect skewness than to detect an excess of power.
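
A minimal sketch of equation (34) is given below. It assumes that the uncertainty of the measured skewness is $\sqrt{6/N_{\rm pix}}$ (the gaussian expectation for independent pixels) and that noise dilutes the skewness as in equation (32). The function name and the example skewness are hypothetical; the pixel count corresponds to 3 square degrees sampled with 0.12' pixels.

```python
import math


def required_snr(skew, n_pix, q):
    """Minimum sigma_signal/sigma_noise for a q-sigma skewness detection (eq. 34)."""
    x = (skew**2 * n_pix / (6.0 * q**2)) ** (1.0 / 3.0)
    if x <= 1.0:
        # Even a noiseless map would not reach q sigma with this many pixels.
        return math.inf
    return (x - 1.0) ** -0.5


# Hypothetical example: combined skewness of -1 in a map with 750000 pixels.
snr = required_snr(skew=-1.0, n_pix=750_000, q=3)
print(f"required S/N per pixel = {snr:.3f}")
```

Since $\sigma_{\rm noise} \propto 1/\sqrt{t}$, the integration time scales as $(S/N)^2$, so a signal-to-noise ratio 25-40% larger corresponds to $1.25^2 \approx 1.6$ to $1.4^2 \approx 2.0$ times more integration time.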

We note here that these values have been obtained directly from the
maps, without applying any filtering. These numbers can be improved
if we apply to our maps an appropriate filter to enhance the
contribution of sources over the noise level, prior to the skewness
or $P(D)$ computation. We will discuss this issue in the next
section, where we also consider the effect of primary anisotropies on
the maps.


10 INCLUDING PRIMARY ANISOTROPIES 

The computations in the last two sections were done assuming that
the primordial CMB gives a negligible contribution compared with that
of the SZ signal and the noise. However, this is only true at
arcminute scales, where the primordial anisotropy is damped to very
small amplitudes, well below the level of the expected SZ anisotropy
(see the top panel of figure 11). Therefore, only those experiments
which are observing at the high multipole region of the angular power
spectrum are going to directly observe the cluster contribution. This
is the case of interferometers with long baselines (e.g. AMIBA, see
Zhang, Pen & Wang 2002, or BIMA, Dawson et al. 2002, which has a
primary beam of 6.6'), or experiments with small fields of view. As
is well known, if we are observing small patches on the sky, we are
sensitive to multipoles greater than
$\ell_{\rm min} \sim 2\pi/\theta_{\rm map}$, $\theta_{\rm map}$ being
the size of our map (see e.g. Hobson & Magueijo 1996 for a detailed
study). Therefore, for small fields of view we are implicitly
filtering out the lowest multipoles, and we can directly apply the
formalism described in the last section.
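
As a quick illustration of the lowest accessible multipole, the sketch below evaluates $\ell_{\rm min} \sim 2\pi/\theta_{\rm map}$ for a given map size. The function name is an assumption for this example, and the relation is only the order-of-magnitude estimate quoted above.

```python
import math


def ell_min(map_size_deg):
    """Approximate lowest multipole sampled by a map of the given side (degrees)."""
    return 2.0 * math.pi / math.radians(map_size_deg)


for size in (1.0, 3.0, 17.0):
    print(f"map side {size:5.1f} deg -> ell_min ~ {ell_min(size):6.1f}")
```

A 1°-side map therefore probes multipoles above roughly 360, which is still well inside the range where primary anisotropies carry substantial power.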

However, if our experiment is sensitive to low multipoles
($\ell < 2000$), then it is necessary to add a new step in our
pipeline, which consists of filtering out these low multipoles, if we
want to remove the contribution of primordial anisotropies. This will
be the case for experiments which will cover thousands of square
degrees on the sky (e.g. the 8 m South Pole telescope), or
experiments covering the whole sky (e.g. the PLANCK satellite). This
step is common to all techniques which aim to extract
sources/clusters from a map using a single-frequency observation.
Many filters can be found in the literature, ranging from matched
filters (e.g. Haehnelt & Tegmark 1996; Tegmark & de Oliveira-Costa
1998) and pseudo-filters (e.g. Sanz, Herranz & Martínez-González
2001) to wavelets
(e.g. lCav^ Any of these methods can be used 

to pre-process the maps, reducing the relative contribution of
primary anisotropies, and also that of the noise. The important point
is that under linear filters, gaussian data will still be gaussian,
and a distribution of sources will still show a skewed shape.
10.1 Effect of primary anisotropies on the observed P(D)

To illustrate this issue, we have performed a new simulation of SZ
clusters, covering a larger area (2048 x 2048 pixels of 0.5' side
each, so the map covers ~ 291 square degrees), using the same
parameters as for the previous simulations,
but with Mmin = W^^Mq. The reason of considering such a 
large area is because future experiments will cover hundreds 
(or thousands) of square degrees on sky. We add to this map 
a CMB realisation following the power-spectra plotted in the 
top panel of figureim The rms of the simulated SZ map and 
the CMB realisation are 5.7 $\mu$K and 112.3 $\mu$K, respectively.
Finally, we also add white gaussian noise with an amplitude of
10 $\mu$K (per pixel of 0.5'), and we smooth the resulting map with a
gaussian beam of $\theta_b = 1'$. Using this map, we have examined
two different filters:

• a Hanning (or high-pass) filter, which removes the contribution of
all multipoles below a certain value $\ell_0$, but linearly
increasing from 0 to 1 in the range
$[\ell_0 - \Delta\ell, \ell_0 + \Delta\ell]$ to avoid the ringing
associated with a sharp cut. We have used here $\ell_0 = 2000$ and
$\Delta\ell = 50$.

• a matched filter optimised to detect point-source-like objects
(Haehnelt & Tegmark 1996), so
$W_\ell \propto (B_\ell\,C_\ell^{\rm tot})^{-1}$. Here, $W_\ell$ and
$B_\ell$ are the coefficients in a Legendre polynomial expansion of
the filter and of the beam, respectively, and $C_\ell^{\rm tot}$ is
the sum of the power spectrum of the components to be removed (noise
and CMB in this case).
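
The high-pass window of the first filter can be sketched as a function of multipole as follows. This is a minimal illustration of the linear ramp described above, using the quoted $\ell_0 = 2000$ and $\Delta\ell = 50$ as defaults; the function name is hypothetical, and the application of the window to the Fourier modes of a map is not shown.

```python
def high_pass_window(ell, ell0=2000.0, delta_ell=50.0):
    """Filter weight at multipole ell: 0 below the ramp, 1 above, linear inside."""
    lower = ell0 - delta_ell
    upper = ell0 + delta_ell
    if ell <= lower:
        return 0.0
    if ell >= upper:
        return 1.0
    return (ell - lower) / (upper - lower)


for ell in (1000, 1950, 1975, 2000, 2050, 3000):
    print(ell, high_pass_window(ell))
```

Replacing the sharp step with a ramp of finite width trades a small leakage of power around $\ell_0$ for reduced ringing in real space.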

In Figure 15 we show the predicted $P(D)$ function from our
simulation when adding CMB primary anisotropies, with and without
noise, and the predicted $P(D)$ after filtering the maps with the
previous filters. For comparison, we also show the $P(D)$ function
for the SZ map alone. In order to use the same scale for plotting all
the curves, we have divided each map by its rms prior to the $P(D)$
computation. We can see that a simple high-pass filter is able to
remove the main contribution from the primordial CMB, so the $P(D)$
function becomes dominated by the contribution of SZ clusters, and
the negative tail is visible. However, a small residual signal of the
CMB fluctuations still persists, even in the case where we do not
consider noise. Nevertheless, this residual can be modelled as an
extra gaussian noise: a $P(D)$ analysis does not distinguish two
signals if both of them are gaussian. When using the optimal filter,
the $P(D)$ function shows its tail more clearly. Therefore, when
pre-processing our maps with a filter, we are able to reduce the
problem to the case discussed in the previous section, where we only
have SZ signal, sources and noise, and hence the negative tail shows
us the presence of clusters. For completeness, we report the number
of pixels which are above 3 sigma in these maps, to illustrate the
significance of the "detection" of the tail. For the case which
includes noise, these numbers are 7500, 21503, 40475 and 76345 for
the non-filtered, the Hanning filtered, the optimal filtered and the
pure SZ maps, respectively. These numbers have a sampling error of
106 pixels. For a gaussian distribution, we should expect to have
11324 pixels (0.27% for $3\sigma$) out of $2048^2$ in total.
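
The gaussian expectation quoted above can be reproduced with a few lines of Python. This sketch assumes a two-sided $3\sigma$ threshold and treats the pixels as independent, so that the sampling error is the square root of the expected count; for a beam-smoothed map the latter is only an approximation.

```python
import math


def expected_outliers(n_pix, n_sigma=3.0):
    """Expected number of pixels beyond +/- n_sigma for gaussian data, and its
    Poisson-like sampling error (independent pixels assumed)."""
    tail_probability = math.erfc(n_sigma / math.sqrt(2.0))  # two-sided
    mean = n_pix * tail_probability
    return mean, math.sqrt(mean)


mean, error = expected_outliers(2048**2)
print(f"expected pixels beyond 3 sigma: {mean:.0f} +/- {error:.0f}")
```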

It is also interesting to demonstrate what we would expect if a
given survey is covering a small area on the sky, but large enough to
have an important contribution from primary anisotropies. This would
be the case of a hypothetical



Figure 15. Top: P(D) function from a Press-Schechter SZ realisation
of a 291 square degree patch of the sky, observed with a gaussian
beam of $\theta_b = 1'$, for four cases: (a) adding CMB primary
anisotropies and white gaussian noise of 10 $\mu$K per pixel (solid
line); (b) filtering the previous map with the optimal matched filter
described in Haehnelt & Tegmark (1996) (dashed line); (c) filtering
the map with a Hanning (high-pass) filter (dotted line); (d)
considering the SZ map alone (dot-dashed line). Bottom: Same as the
previous panel, but without including noise. In order to use the same
scale for plotting all curves, we have divided each map by its rms
prior to the P(D) computation. In this case, rms(SZ) = 4.9 $\mu$K,
rms(SZ + CMB) = 112.2 $\mu$K and rms(SZ + CMB + Noise) = 112.3 $\mu$K.


experiment covering a 1°-side patch on the sky, so we have used
here the simulations described in section ITTI In this case, we 
have the additional problem of the sample variance: the observed
$P(D)$ function will show deviations from the expected gaussian
curve. Therefore, filtering in this case is absolutely necessary. We
illustrate this point with figure 16, where we show the $P(D)$
function from a single 1°-side map, containing SZ signal, CMB and no
noise, and also the same plot for the average of 15 maps of the same
type. We can see that, although the $P(D)$ function for the map
without filtering marginally shows a negative tail when we average 15
maps, the sample variance completely deletes this tail when we
consider a single map. This point can also be illustrated if we use
those 15 1°-side simulations to compute the uncertainty introduced by
the sample variance in the number of pixels above the $3\sigma$
threshold. This number was
Figure 16. Effect of the CMB sample variance on the observed P(D)
function. Top: P(D) function from a single Press-Schechter SZ map of
1° side, observed with a gaussian beam of $\theta_b = 1'$, for the
cases (a), (b) and (d) discussed in figure 15. We can see that the
observed P(D) function is distorted due to the sample variance
contribution of the primordial CMB anisotropies. The dotted line
corresponds to a gaussian with a width equal to the observed rms in
the map, i.e. the P(D) function that we would observe if all the
signals in the map were gaussian and we had no sample variance.
Bottom: same as previous panel, but averaging 15 realisations. In
this case, the sample variance is reduced and a negative tail is
marginally detected even without filtering. Here, rms(SZ) = 12.3
$\mu$K and rms(SZ + CMB) = 54.5 $\mu$K in the top panel, and rms(SZ)
= 13.9 $\mu$K and rms(SZ + CMB) = 74.2 $\mu$K in the bottom one.


Figure 17. Effect of primordial CMB anisotropies on the observed
power spectrum/bispectrum. We use the same 15 realisations as in
figure 11, but adding a CMB realisation to each one of them,
following the plotted $\Lambda$CDM model. Here we do not apply any
filtering to the maps. Top: The recovered power spectrum from the 15
realisations (diamonds). Error bars correspond to the field-to-field
variance. At low $\ell$ values, the power spectrum traces that of the
CMB, while in the high-$\ell$ region, the spectrum traces that of the
SZ maps (dot-dashed line). Bottom: Bispectrum for the same 15
realisations (solid line). Dotted lines show the 1-sigma
field-to-field variance. We see that at high $\ell$ the bispectrum
follows that of the SZ signal (dot-dashed line), but at low $\ell$
(large scales) it goes to zero because the power becomes dominated by
that of the primordial CMB, which has zero skewness.


found to be 0.5%, whilst the expected number of pixels for a
gaussian distribution above $3\sigma$ is 0.27%. However, even in this
case the filtering is able to remove the main contribution of the
primary anisotropies, and hence the negative tail is seen.


10.2 Effect of primary anisotropies on the observed skewness/bispectrum

We conclude this section by showing the expected skewness and
bispectrum from our 15 PS simulations, taking into account a CMB
component in the maps. It is important to notice here that for the
bispectrum it is not necessary to use any filter, because it directly
gives the 'skewness' at each angular scale.

These results for the power spectrum and the bispectrum are shown in
figure 17. From here, we can conclude that if the relative
contribution of SZ clusters to the observed bispectrum is larger than
the one from radio sources, then we expect to directly see their
negative signal in the bispectrum without pre-processing the maps.

To obtain the skewness in real-space, we proceed as in 
section 1731 but pre-processing the maps with any of the 
filters described in the last subsection. The results obtained from
averaging over the same 15 PS maps are shown in figure 18. The main
conclusion is that the residual CMB power in the filtered map dilutes
the SZ skewness, although it is still clearly seen. This is exactly
what we would expect from equation (32). From simple inspection of
the top panel of figure 17, we expect to have a residual CMB power
after
Figure 18. Negative skewness for SZ clusters in the presence of
primary CMB anisotropies, as a function of the angular scale. We 
present the skewness from the same 15 PS maps as in figure uni 
with (solid) and without (dot-dashed) adding primordial CMB
anisotropies. When the CMB is added, the maps are filtered, prior to
the skewness computation, using a Hanning filter with
$\ell_0 = 2000$, as described in the text. Although the skewness in
the real map is diluted due to the residual power from the CMB in the
filtered map, it is still detectable.


filtering with a Hanning filter of $\ell_0 = 2000$ which is similar
to, or slightly smaller than, the SZ power. Therefore, we would
expect the skewness to be reduced (roughly) by a factor
$2^{3/2} \approx 2.8$, as we observed from the simulations.
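
The factor quoted above follows from the rule for combining the skewness of two uncorrelated maps, applied to an SZ map and a gaussian (zero-skewness) CMB residual. As a short worked step, assuming the residual CMB rms equals the SZ rms:

```latex
\begin{equation}
\mathrm{Skew}_{\rm obs}
  = \frac{\mathrm{Skew}_{\rm SZ}\,\sigma_{\rm SZ}^3 + 0 \cdot \sigma_{\rm CMB}^3}
         {\left(\sigma_{\rm SZ}^2 + \sigma_{\rm CMB}^2\right)^{3/2}}
  \;\xrightarrow{\;\sigma_{\rm CMB} = \sigma_{\rm SZ}\;}\;
  \frac{\mathrm{Skew}_{\rm SZ}}{2^{3/2}}
  \approx \frac{\mathrm{Skew}_{\rm SZ}}{2.8}.
\end{equation}
```

A residual slightly smaller than the SZ power gives a correspondingly smaller reduction.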


11 CONCLUSIONS 

In this paper we have discussed five statements:

• The contribution of SZ clusters to the map noise at
$\lambda > 1.25$ mm does not depend on the wavelength, and has a
strong and peculiar non-gaussian negative tail in the $P(D)$
function.

• This contribution has a characteristic negative skewness (or
bispectrum) at $\lambda > 1.25$ mm. This fact can be used by current
single-frequency experiments, such as CBI or BIMA, or by future
experiments, such as ACT, AMI, AMIBA, APEX, or the 8-m South Pole
telescope, to distinguish if the detected excess of power at small
angular scales is due to SZ clusters. In addition, the detection of
skewness only requires a factor 1.5 or 2 more integration time than
the detection of an excess of power. Once the skewness is detected,
the $P(D)$ function starts to show an asymmetry.

• Any multi-frequency experiment would have noise due to clusters of
galaxies with $P(D)$ at $\lambda < 1.25$ mm equal to $P(-D)$ at
$\lambda > 1.25$ mm.

• Skewness and bispectrum will have different signs in these two
spectral regions.

• When dealing with real maps where primordial CMB fluctuations are
present, it is necessary to use filters to remove the contribution of
large angular scales. Only in that case can we detect the presence of
clusters/sources in the $P(D)$ function.
APPENDIX A: DEPENDENCE OF THE P(D)
FUNCTION ON THE SCALING ASSUMPTIONS
AND THE NORMALISATION $\sigma_8$

The scaling relation for the gas temperature used in this work was
obtained from fits to X-ray observations. However, we could also have
used the one derived from the spherical collapse model, which takes
the form (Eke, Cole & Frenk 1996):

$$ kT_{\rm gas} = \frac{7.75}{\beta} \left( \frac{M}{10^{15}\,h^{-1} M_\odot} \right)^{2/3} \left( \frac{\Omega_0\,\Delta_c}{\Omega(z)\,178} \right)^{1/3} (1+z)\ \mathrm{keV} \tag{A1} $$

where $\Delta_c$ is the ratio of the mean density inside the virial
radius to the critical density, and $\beta$ is the ratio of the
specific galaxy kinetic energy to the specific gas thermal energy.
This scaling has been used by other authors (e.g. Komatsu & Kitayama
1999; Molnar & Birkinshaw 2000). We will consider this scaling here
to recompute the simulations, and we will show that the qualitative
behaviour of the $P(D)$ function and the skewness remains unchanged.
We will adopt here $\beta = 1$, the same parameters describing
the cluster (r„o = Mpc, rco = 0.13/i“^ Mpc, and 

$n_{c0} = 2 \times 10^{-3}$ cm$^{-3}$), and the same cosmological
model as in the main text ($\Omega_m = 0.3$, $\Omega_\Lambda = 0.7$,
$h = 0.67$, and $\sigma_8 = 0.9$). We integrate the PS mass function
in the same range as in the main text.
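
A minimal sketch of this temperature scaling is given below. It assumes that the spherical-collapse relation has the usual form $kT \propto \beta^{-1} M^{2/3} [\Omega_0 \Delta_c / (\Omega(z)\,178)]^{1/3} (1+z)$ with a 7.75 keV normalisation, and that $\Omega(z)$ is the matter density parameter of a flat model with matter and a cosmological constant. The function names are hypothetical, and $\Delta_c$ is left as an input because its dependence on cosmology is not specified here.

```python
def omega_matter(z, omega_m=0.3, omega_lambda=0.7):
    """Matter density parameter at redshift z (matter plus Lambda only)."""
    growth = omega_m * (1.0 + z) ** 3
    return growth / (growth + omega_lambda)


def gas_temperature_kev(mass_1e15, z, delta_c, beta=1.0,
                        omega_m=0.3, omega_lambda=0.7):
    """kT_gas in keV for a cluster of mass mass_1e15 (units of 1e15 h^-1 Msun)."""
    density_term = omega_m * delta_c / (
        omega_matter(z, omega_m, omega_lambda) * 178.0)
    return (7.75 / beta) * mass_1e15 ** (2.0 / 3.0) \
        * density_term ** (1.0 / 3.0) * (1.0 + z)


# Illustrative call: a 1e15 h^-1 Msun cluster at z = 0 with delta_c = 178.
print(gas_temperature_kev(1.0, 0.0, 178.0))
```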

Concerning the core evolution, in this work we have used an
entropy-driven model with $\epsilon = 0$, which is described as:

$$ r_c(M, z) = r_{c0} \left( \frac{M}{10^{15}\,h^{-1} M_\odot} \right)^{-1/6} \left( \frac{\Omega_0\,\Delta_c}{\Omega(z)\,178} \right)^{-1/12} (1+z)^{-1/4} \tag{A2} $$

but without using the $\Delta_c$ factor. Therefore, we now repeat
the computation taking it into account. A detailed study of how the
power spectrum changes when assuming different core evolution models
(a self-similar collapse, or an entropy-driven model with different
values for the $\epsilon$ parameter) can be found in Komatsu &
Kitayama (1999).

For definiteness, we will use the notation 'Model A' for the scaling
relations used in the main text (derived from fits to X-ray
observations), and 'Model B' for the new scaling relations considered
in this appendix. Figure A1 shows the resulting power spectrum
($C_\ell$) and bispectrum ($I_\ell$) for these two models. We can see
that the qualitative behaviour of the bispectrum is the same.

Figure A2 shows the $P(D)$ function for these two models. Despite
the different width of these curves (due to the different total power
in the spectrum, see figure A1), when rescaling them by their
$\sigma$ ($\sigma^2 = \int D^2 P(D)\,dD$), the shape of both curves
is very similar.
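
The rescaling used for this comparison can be sketched as follows, assuming that $P(D)$ is tabulated on a grid of deflections, normalised to unit area and centred on zero. The function names are illustrative, and the integral $\sigma^2 = \int D^2 P(D)\,dD$ is evaluated with a simple trapezoidal rule.

```python
import math


def pd_sigma(d_values, p_values):
    """sigma from sigma^2 = integral of D^2 P(D) dD (trapezoidal rule)."""
    total = 0.0
    for i in range(len(d_values) - 1):
        left = d_values[i] ** 2 * p_values[i]
        right = d_values[i + 1] ** 2 * p_values[i + 1]
        total += 0.5 * (left + right) * (d_values[i + 1] - d_values[i])
    return math.sqrt(total)


def rescale_pd(d_values, p_values):
    """Return (D / sigma, sigma * P), which keeps the curve normalised."""
    sigma = pd_sigma(d_values, p_values)
    return [d / sigma for d in d_values], [p * sigma for p in p_values]


# Self-check on a gaussian P(D) of width 2 (arbitrary units).
grid = [-20.0 + 0.01 * i for i in range(4001)]
gauss = [math.exp(-0.5 * (d / 2.0) ** 2) / (2.0 * math.sqrt(2.0 * math.pi))
         for d in grid]
print(pd_sigma(grid, gauss))
```

After this rescaling, two curves of different width can be overplotted in units of their own $\sigma$, so that any remaining difference reflects the shape of the distribution rather than its total power.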

Finally, we have explored how sensitive the shape of the $P(D)$ is
to the $\sigma_8$ value. We then obtain 15 simulations for each one
of the following values of $\sigma_8$: 0.6, 0.7, 0.8, 0.9, 1.0 and
1.1, keeping the same cosmological model ($\Omega_m = 0.3$,
$\Omega_\Lambda = 0.7$ and $h = 0.67$). For the scaling of the
temperature and core radius with mass and redshift, we adopt the
values of the 'Model B' described above.

It is well-known that the SZ a ngular power spectrum 
scales as Ct oc ag ® (see, e.g. iKomatsu fc Kitavamal 
il999(B . In our particular case, our power spectra derived 
Figure A1. Top: Angular power spectrum for the SZ effect derived
from the Press-Schechter prescription, for two different models.
These models (A and B) correspond to the same cosmological model and
the same normalisation ($\sigma_8 = 0.9$), but different scaling
relations for the temperature and the core radius with the mass and
redshift of the cluster (see details in text). The curves were
obtained averaging 15 simulations (1° side) for each case. Bottom:
Same as before, but for the bispectrum.


shows the obtained power spectra and bispectrum for these 
values. The shapes of the bispectrum curves change slightly 
when changing the value of cjs. However, we can find an 
scaling for a given multipole. Given that SZ power spec¬ 
trum peaks around £ ~ 3000, we have done the study for 
£ = 3000, obtaining that the bispectrum scales as l£ oc 

The corresponding P(-D) functions are shown in fig¬ 
ure using a beam size of Ot = 3'. Given that we are 
particularly interested in the intermediate asymptotic re¬ 
gion, we obtained the scaling of the ’a’-factor in that region 
{P{D) ~ e“^). From our simulations, we have 


a « 0.055 



\pK- 


(A3) 


for the region 1' ^ Ob ^ 5' and 0.6 Si erg ^ 1.1. This is pre¬ 
cisely the scaling with erg that we would expect at first order, 
given that the shape of the P{D) function is roughly propor¬ 
tional to the a of the SZ map, and the latter is proportional 
to . For illustration, in the lower panel of figure IXdl we 


from the simulations shows an scaling Ce ~ nf. Figure show the different P{D) curves, rescaled by D ^ Derg 
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e, = 3' 




D/o 

Figure A2. Top: P(D) functions for the two models considered 
in figure ixn smoothing the maps with a gaussian beam of size 
= 3'. Bottom: Same figure, but rescaling the curves by their a 
(obtained as = J D^P(D)dD. 

for the case Ob = 3'. As we can see, their asymptotics become 
similar. 
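
As an illustration of how an asymptotic slope of this kind could be estimated, the following minimal sketch fits a straight line to $\ln P(D)$ versus $D$ inside a chosen window. It assumes the convention $P(D) \sim e^{aD}$ on the negative tail; the function name `tail_slope`, the fitting window and the value `a_true` are hypothetical choices made for the example, and the input curve is synthetic rather than taken from the simulations.

```python
import numpy as np

def tail_slope(d, p, d_min, d_max):
    """Least-squares slope of ln P(D) versus D for d_min <= D <= d_max.

    Under the assumed convention P(D) ~ exp(a*D) in the fitted window,
    the returned slope is an estimate of 'a' (units of 1/D).
    """
    d = np.asarray(d, dtype=float)
    p = np.asarray(p, dtype=float)
    mask = (d >= d_min) & (d <= d_max) & (p > 0)   # avoid log(0)
    slope, _intercept = np.polyfit(d[mask], np.log(p[mask]), 1)
    return slope

# Synthetic curve: exponential negative tail, Gaussian positive side.
d = np.linspace(-200.0, 50.0, 501)
a_true = 0.05   # arbitrary illustrative value, not a result from the text
p = np.where(d < 0, np.exp(a_true * d), np.exp(-0.5 * (d / 10.0) ** 2))

a_fit = tail_slope(d, p, -150.0, -30.0)

# Stretching the D axis by a factor k divides the fitted slope by k,
# which is why a suitable rescaling of D can make the tails parallel.
a_stretched = tail_slope(2.0 * d, p, -300.0, -60.0)
```

In practice the window would have to be chosen to isolate the intermediate asymptotic region, and the fit would be repeated for each normalisation and beam size.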




Figure A3. Top: Power spectrum of the SZ effect for different
normalisations ($\sigma_8$ values). Each curve has been obtained by
averaging 15 realizations of 1° size (error bars are not shown).
For comparison, the corresponding primordial CMB power
spectrum and the current observational measurements
from CBI and BIMA are also shown. Bottom: Bispectrum of the SZ effect
for the same values of $\sigma_8$ considered above.
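
A power-law scaling at a fixed multipole, of the kind discussed in this appendix, could be estimated with a log-log linear fit. The sketch below is only illustrative: the function name `power_law_index` is hypothetical, and the amplitudes are synthetic, generated with an arbitrary exponent `n_true` that is not a value from the text. Only the grid of $\sigma_8$ values is taken from the appendix.

```python
import numpy as np

def power_law_index(x, y):
    """Fit y = norm * x**n by linear least squares in log-log space.

    Returns (n, norm). Assumes all x and y are strictly positive.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n, log_norm = np.polyfit(np.log(x), np.log(y), 1)
    return n, np.exp(log_norm)

# Normalisations explored in the text.
sigma8 = np.array([0.6, 0.7, 0.8, 0.9, 1.0, 1.1])

# Synthetic amplitudes at a single multipole (illustrative only).
n_true = 5.0    # arbitrary exponent chosen for the example
amp = 2.0 * sigma8 ** n_true

n_fit, norm = power_law_index(sigma8, amp)
```

With simulated spectra, `amp` would be replaced by the averaged amplitude at the chosen multipole for each normalisation, and the scatter between realizations would set the uncertainty on the fitted exponent.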
Figure A4. Top: $P(D)$ curves for the same realizations considered
in figure A3. The maps have been smoothed with a 3' FWHM
gaussian beam ($\theta_b = 3'$). Bottom: Same $P(D)$ curves, but rescaled
by $D \to D\,(\sigma_8/0.9)^{-[\ldots]}$ (in $\mu$K). We can see that now the
asymptotic regions are parallel.
















